indexia package
indexia.eidola module
Make sample creators & creatures.
- class indexia.eidola.Maker(test_db, species_per_genus, num_beings, trait)
Bases: object
Fashion any number of creators & creatures for testing.
- get()
Get test data.
- Returns:
fathers (list(pandas.DataFrame)) – List containing a single dataframe of creator data.
sons (list(pandas.DataFrame)) – List containing species_per_genus dataframes of creature data.
grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^2 dataframes of creature data.
great_grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^3 dataframes of creature data.
- make()
Make test data.
Makes a single creator table & 3 generations of creature tables. Each table in one generation is the parent of species_per_genus tables in the next generation, with num_beings creatures in each table. The creators & creatures all have the attribute trait.
- Returns:
fathers (list(pandas.DataFrame)) – List containing a single dataframe of creator data.
sons (list(pandas.DataFrame)) – List containing species_per_genus dataframes of creature data.
grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^2 dataframes of creature data.
great_grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^3 dataframes of creature data.
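The list lengths above follow from each table having species_per_genus child tables. A minimal sketch of that arithmetic, where generation_sizes is a hypothetical helper and not part of indexia:

```python
def generation_sizes(species_per_genus, generations=3):
    """Return the number of creature tables in each generation.

    Hypothetical helper: generation k holds species_per_genus ** k tables,
    matching the sons, grandsons & great_grandsons lists described above.
    """
    return [species_per_genus ** k for k in range(1, generations + 1)]


sizes = generation_sizes(2)
print(sizes)  # [2, 4, 8]
```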
- make_creators(ix, cnxn, genus)
Make creator beings.
The number of beings specified by num_beings will be created in a table named genus.
- Parameters:
ix (indexia.indexia.Indexia) – An Indexia instance.
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table.
- Returns:
creators – Dataframe of creator data.
- Return type:
list(pandas.DataFrame)
- make_creatures(ix, cnxn, genus, species)
Make sample creatures with a given genus & species.
A creature table named species is created, & num_beings creature records are added to the table. Each creature has the attribute trait.
- Parameters:
ix (indexia.indexia.Indexia) – An Indexia instance.
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table.
species (str) – Name of the creature (child) table.
- Returns:
creatures – Dataframe of creature data.
- Return type:
pandas.DataFrame
- make_species(ix, cnxn, genus, species_prefix)
Make one or more species of a given genus.
The number of species created in the genus is given by species_per_genus. For each species created, num_beings creature records are added to the species, each having the attribute trait.
- Parameters:
ix (indexia.indexia.Indexia) – An Indexia instance.
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table.
species_prefix (str) – Prefix of the creature (child) table names.
species (str) – Name of the creature (child) table.
- Returns:
species – List of dataframes containing creature data.
- Return type:
list(pandas.DataFrame)
- class indexia.eidola.Templates(db)
Bases: object
Create template indexia objects.
- build_template(template_name)
Create objects for the given template.
- Parameters:
template_name (str) – Name of the template to build.
- Raises:
ValueError – If template_name is not a valid template.
- Returns:
objects – List of tuples containing table names & object data.
- Return type:
list(tuple(str, pandas.DataFrame))
- show_templates()
Show available templates.
- Returns:
templates – Dictionary of available template names & table structures.
- Return type:
dict
indexia.indexia module
Defines core operations on indexia objects.
- class indexia.indexia.Indexia(db=None)
Bases: object
Core class for creating, modifying, & retrieving indexia objects.
- add_creator(cnxn, genus, trait, expr)
Get or create a creator entity.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table to be retrieved or created.
trait (str) – Name of the creator’s text attribute.
expr (str) – Value of the creator’s text attribute.
- Returns:
creator – A single-row dataframe of creator entity data.
- Return type:
pandas.DataFrame
- add_creature(cnxn, genus, creator, species, trait, expr)
Get or create a creature of a given creator.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table.
creator (pandas.DataFrame) – A single-row dataframe of creator entity data.
species (str) – Name of the creature (child) table to be retrieved or created.
trait (str) – Name of the creature’s text attribute.
expr (str) – Value of the creature’s text attribute.
- Returns:
creature – A single-row dataframe of creature entity data.
- Return type:
pandas.DataFrame
- close_all_cnxns()
Close all database connections.
- Return type:
None.
- close_cnxn(db)
Close connections to a database.
- Parameters:
db (str) – Path to the database file.
- Return type:
None.
- delete(cnxn, species, entity_id)
Delete an entity from a table by ID.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
species (str) – Name of the table from which to delete.
entity_id (int) – ID of the entity to delete.
- Returns:
rows_deleted – Count of rows affected by DELETE statement.
- Return type:
int
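As an illustration of how a delete-by-ID can report the number of affected rows, here is a minimal sketch using only the standard sqlite3 module. The helper name delete_by_id & the primary-key column name id are assumptions made for the example; the actual indexia implementation may differ.

```python
import sqlite3

cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, trait TEXT)")
cnxn.executemany("INSERT INTO species (trait) VALUES (?)", [("a",), ("b",)])


def delete_by_id(cnxn, tablename, entity_id):
    """Delete one row by id & return the count of rows affected.

    Hypothetical helper; assumes the key column is named "id".
    Table names cannot be bound as parameters, so only the id is bound.
    """
    cursor = cnxn.execute(f"DELETE FROM {tablename} WHERE id = ?", (entity_id,))
    cnxn.commit()
    return cursor.rowcount


rows_deleted = delete_by_id(cnxn, "species", 1)
print(rows_deleted)  # 1
```

Deleting the same id a second time affects no rows, so the count returned is 0.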
- get_all_tables(cnxn)
Get all tables in the instance database.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
- Returns:
tables – Dataframe describing all database tables.
- Return type:
pandas.DataFrame
- get_by_id(cnxn, kind, being_id)
Get an entity by its id.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
kind (str) – Name of the table to query.
being_id (int) – Value of the entity’s id.
- Returns:
being – Dataframe of being data.
- Return type:
pandas.DataFrame
- get_by_trait(cnxn, kind, expr)
Get being(s) by the text attribute value.
Note that since values of the trait column need not be unique, it is possible that the dataframe returned will contain more than one being.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
kind (str) – Name of the table to query.
expr (str) – Value of the being’s trait (text attribute).
- Returns:
being – Dataframe of one or more beings.
- Return type:
pandas.DataFrame
- get_creator(cnxn, species, creature)
Get the creator of a given creature.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
species (str) – Name of the creature (child) table.
creature (pandas.DataFrame) – A single-row dataframe of creature entity data.
- Returns:
genus (str) – Name of the creator (parent) table.
creator (pandas.DataFrame) – A single-row dataframe of creator entity data.
- get_creator_genus(cnxn, species)
Get table name of creator (parent) table.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
species (str) – Name of the creature (child) table.
- Raises:
ValueError – If more than one creator table is found, raise a ValueError. Each creature table should have one & only one creator.
- Returns:
genus – Name of the creator (parent) table.
- Return type:
str
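One way to look up a child table's parent in SQLite is to inspect its foreign keys. The sketch below assumes that creature tables point to their creator table through a foreign key (the column name genus_id & the helper find_parent_table are hypothetical); it is not necessarily how get_creator_genus is implemented.

```python
import sqlite3

cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE genus (id INTEGER PRIMARY KEY, trait TEXT)")
cnxn.execute(
    "CREATE TABLE species ("
    "id INTEGER PRIMARY KEY, trait TEXT, "
    "genus_id INTEGER REFERENCES genus(id) ON DELETE CASCADE)"
)


def find_parent_table(cnxn, tablename):
    """Return the single table referenced by tablename's foreign keys.

    Hypothetical helper. In PRAGMA foreign_key_list output, the third
    column (index 2) holds the name of the referenced table.
    """
    rows = cnxn.execute(f"PRAGMA foreign_key_list({tablename})").fetchall()
    parents = {row[2] for row in rows}
    if len(parents) != 1:
        # This sketch also rejects tables with no parent at all.
        raise ValueError(f"Expected exactly one parent table, found {len(parents)}.")
    return parents.pop()


print(find_parent_table(cnxn, "species"))  # genus
```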
- get_creature_species(cnxn, genus)
Get types of all creatures with a given creator genus.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table.
- Returns:
species – List of creature (child) table names.
- Return type:
list(str)
- get_creatures(cnxn, genus, creator)
Get all creatures of a given creator.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
genus (str) – Name of the creator (parent) table.
creator (pandas.DataFrame) – A single-row dataframe of creator entity data.
- Returns:
creatures – List of two-tuples whose first entry is the name of the creature (child) table, & whose second entry is a dataframe of creature data.
- Return type:
list(tuple(str, pandas.DataFrame))
- get_df(cnxn, sql, expected_columns=None, raise_errors=False)
Get result of SQL query as a pandas dataframe. If an exception occurs & raise_errors is False, return an empty dataframe.
- Parameters:
cnxn (sqlite3.Connection) – Connection to the database.
sql (str) – SQL to be executed by pandas.read_sql.
expected_columns (list(str), optional) – List of expected columns. If raise_errors is True & the dataframe columns do not match expected_columns, a ValueError is raised. The default is None.
raise_errors (bool, optional) – Whether to raise exceptions encountered during execution. The default is False.
- Raises:
error – If raise_errors is True, raise any error encountered during execution.
- Returns:
df – A dataframe containing the results of the SQL query.
- Return type:
pandas.DataFrame
- get_or_create(cnxn, tablename, dtype, cols, vals, retry=True)
Get entities from an existing table, or create the table & (optionally) insert them.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
tablename (str) – Name of the database table. If the table does not exist, it will be created.
dtype (dict(str, str)) – Dict of table columns & column data types.
cols (list(str)) – Columns to be used in SELECT statement.
vals (list(str)) – Values to be used in SELECT statement.
retry (bool, optional) – If True & SELECT returns an empty result, INSERT the specified values & try again. The default is True.
- Raises:
ValueError – Raised when no matching rows are found & retry is False.
- Returns:
result – A dataframe of rows matching column & value criteria.
- Return type:
pandas.DataFrame
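The get-or-create behavior described above can be sketched with the standard sqlite3 module. This is an illustrative stand-in under stated assumptions, not the indexia implementation: it returns a list of row tuples rather than a dataframe, & it binds values as query parameters.

```python
import sqlite3


def get_or_create(cnxn, tablename, dtype, cols, vals, retry=True):
    """Select matching rows, inserting them first if none exist (sketch)."""
    # Create the table if it does not exist yet.
    columns = ", ".join(f"{name} {kind}" for name, kind in dtype.items())
    cnxn.execute(f"CREATE TABLE IF NOT EXISTS {tablename} ({columns})")

    # Try to SELECT rows matching every column/value pair.
    where = " AND ".join(f"{col} = ?" for col in cols)
    rows = cnxn.execute(f"SELECT * FROM {tablename} WHERE {where}", vals).fetchall()
    if rows:
        return rows
    if not retry:
        raise ValueError("No matching rows found.")

    # Nothing found: INSERT the values, then SELECT once more without retrying.
    placeholders = ", ".join("?" for _ in vals)
    cnxn.execute(
        f"INSERT INTO {tablename} ({', '.join(cols)}) VALUES ({placeholders})", vals
    )
    cnxn.commit()
    return get_or_create(cnxn, tablename, dtype, cols, vals, retry=False)


cnxn = sqlite3.connect(":memory:")
dtype = {"id": "INTEGER PRIMARY KEY", "trait": "TEXT"}
first = get_or_create(cnxn, "genus", dtype, ["trait"], ["alpha"])
second = get_or_create(cnxn, "genus", dtype, ["trait"], ["alpha"])
print(first == second)  # True
```

The second call finds the row inserted by the first, so no duplicate is created.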
- get_table_columns(cnxn, tablename)
Get columns of a database table.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
tablename (str) – Name of the database table.
- Returns:
columns – Dataframe describing table columns.
- Return type:
pandas.DataFrame
- get_trait(cnxn, kind)
Gets the trait (attribute) column of the given kind.
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
kind (str) – Name of the table.
- Raises:
ValueError – If no trait column is identified, or if more than one trait column is identified, raise a ValueError.
- Returns:
trait – Name of the trait column.
- Return type:
str
- open_cnxn(db)
Open a connection to a database.
- Parameters:
db (str) – The name of the database.
- Returns:
cnxn – Connection to the database.
- Return type:
sqlite3.Connection
- update(cnxn, tablename, set_cols, set_vals, where_cols, where_vals)
Update values in a database table. Executes a SQL statement of the form
- UPDATE
{tablename}
- SET
{set_cols[0]} = {set_vals[0]}, {set_cols[1]} = {set_vals[1]}, …
- WHERE
{where_cols[0]} = {where_vals[0]} AND {where_cols[1]} = {where_vals[1]} AND …
- Parameters:
cnxn (sqlite3.Connection) – A database connection.
tablename (str) – Name of the table to update.
set_cols (list(str)) – List of columns to update.
set_vals (list(any)) – Updated values for columns.
where_cols (list(str)) – List of columns for WHERE condition.
where_vals (list(any)) – List of values for WHERE condition.
- Returns:
rows_updated – Number of rows affected by update statement.
- Return type:
int
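The statement form shown above can be assembled & executed as in the following sketch. It uses the standard sqlite3 module & binds values as parameters, which is an assumption about style; the actual indexia method may format values into the SQL string directly.

```python
import sqlite3


def update(cnxn, tablename, set_cols, set_vals, where_cols, where_vals):
    """Run UPDATE ... SET ... WHERE ... & return the affected row count (sketch)."""
    set_clause = ", ".join(f"{col} = ?" for col in set_cols)
    where_clause = " AND ".join(f"{col} = ?" for col in where_cols)
    sql = f"UPDATE {tablename} SET {set_clause} WHERE {where_clause}"
    # SET values come first in the statement, so they come first in the bindings.
    cursor = cnxn.execute(sql, list(set_vals) + list(where_vals))
    cnxn.commit()
    return cursor.rowcount


cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, trait TEXT)")
cnxn.executemany("INSERT INTO species (trait) VALUES (?)", [("a",), ("b",)])

rows_updated = update(cnxn, "species", ["trait"], ["z"], ["trait"], ["a"])
print(rows_updated)  # 1
```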
indexia.inquiry module
Generate SQL for indexia database operations.
- class indexia.inquiry.Inquiry
Bases: object
Generate SQL strings from dynamic inputs.
- create(columns)
Get a SQL CREATE TABLE statement.
- Parameters:
tablename (str) – Name of the table to create.
columns (dict(str)) – Dict of columns to add to table. Keys are column names, values are data types.
- Returns:
create – A formatted SQL CREATE TABLE statement.
- Return type:
str
- delete(conditions='')
Get a SQL DELETE FROM statement.
- Parameters:
tablename (str) – Name of the table from which to delete.
conditions (str, optional) – Optional WHERE conditions. The default is ‘’.
- Returns:
delete – A formatted SQL DELETE FROM statement.
- Return type:
str
- insert(values, columns=None)
Get a SQL INSERT statement.
- Parameters:
tablename (str) – Name of table into which values will be inserted.
values (list(str) or list(tuple(str))) – A list of strings or tuples containing strings. Should be equal-length values representing the values to insert.
- Returns:
insert – A formatted SQL INSERT statement.
- Return type:
str
- select(columns, conditions='')
Get a SQL SELECT statement.
- Parameters:
tablename (str) – Name of the table from which to select values.
columns (list(str)) – List of column names to select.
conditions (str, optional) – A SQL-formatted string of conditions. The default is ‘’.
- Returns:
select – A formatted SQL SELECT statement.
- Return type:
str
- update(set_cols, set_values, conditions='')
Get a SQL UPDATE statement.
- Parameters:
tablename (str) – Name of the table in which to update rows.
set_cols (list(str)) – List of column names to update.
set_values (list(any)) – List of values with which to update columns. Paired with set_cols such that set_cols[i] = set_values[i].
conditions (str, optional) – A SQL-formatted string of conditions. The default is ‘’.
- Returns:
update – A formatted SQL UPDATE statement.
- Return type:
str
- where(vals, conjunction='AND')
Construct WHERE condition from columns & values.
- Parameters:
cols (list(str)) – List of column names.
vals (list(any)) – List of values.
conjunction (str, optional) – SQL keyword to use as conjunction between clauses (e.g., AND, OR). The default is ‘AND’.
- Returns:
conditions – A SQL-formatted WHERE condition.
- Return type:
str
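A WHERE condition can be built from paired columns & values roughly as follows. This is a sketch only: whether the real method emits the WHERE keyword itself, & how it quotes values, are assumptions made for the example.

```python
def where(cols, vals, conjunction="AND"):
    """Build a SQL WHERE condition from paired columns & values (sketch)."""

    def quote(val):
        # Quote strings, doubling embedded single quotes; leave other values bare.
        if isinstance(val, str):
            return "'" + val.replace("'", "''") + "'"
        return str(val)

    clauses = [f"{col} = {quote(val)}" for col, val in zip(cols, vals)]
    return "WHERE " + f" {conjunction} ".join(clauses)


print(where(["id", "trait"], [1, "x"]))
# WHERE id = 1 AND trait = 'x'
```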
- class indexia.inquiry.Tabula
Bases: object
Defines columns & data types of indexia tables.
- get_creator_table(trait)
Get name & columns of a creator (parent) table.
- Parameters:
genus (str) – Name of the creator (parent) table.
trait (str) – Name of the creator’s text attribute.
- Returns:
creator_table – A tuple whose first entry is the name of the creator table, & whose second is a dict of table columns & data types.
- Return type:
tuple(str, dict)
- get_creature_table(species, trait)
Get name & columns of a creature (child) table.
- Parameters:
creator (str) – Name of the creator (parent) table.
species (str) – Name of the creature table.
trait (str) – Name of the creature’s text attribute.
- Returns:
creature_table – A tuple whose first entry is the name of the creature table, & whose second is a dict of table columns & data types.
- Return type:
tuple(str, dict)
- references(on_column, on_delete='CASCADE', on_update='CASCADE')
Generate SQL-formatted REFERENCES clause.
- Parameters:
tablename (str) – Name of the referenced table.
on_column (str) – Name of the referenced column.
on_delete (str, optional) – Behavior of the child entity when the parent entity is deleted. The default is ‘CASCADE’.
on_update (str, optional) – Behavior of the child entity when the parent entity is updated. The default is ‘CASCADE’.
- Returns:
references – A SQL-formatted REFERENCES clause.
- Return type:
str
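To show what a REFERENCES clause with CASCADE behavior does in SQLite, the sketch below builds one & uses it in a child table definition. The exact string produced by the real method is not documented here, so the format & the column name genus_id are assumptions.

```python
import sqlite3


def references(tablename, on_column, on_delete="CASCADE", on_update="CASCADE"):
    """Build a SQL REFERENCES clause (sketch; the output format is assumed)."""
    return (
        f"REFERENCES {tablename}({on_column}) "
        f"ON DELETE {on_delete} ON UPDATE {on_update}"
    )


clause = references("genus", "id")

cnxn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
cnxn.execute("PRAGMA foreign_keys = ON")
cnxn.execute("CREATE TABLE genus (id INTEGER PRIMARY KEY, trait TEXT)")
cnxn.execute(
    f"CREATE TABLE species (id INTEGER PRIMARY KEY, trait TEXT, genus_id INTEGER {clause})"
)
cnxn.execute("INSERT INTO genus (trait) VALUES ('parent')")
cnxn.execute("INSERT INTO species (trait, genus_id) VALUES ('child', 1)")

# Deleting the parent row also removes its child row because of ON DELETE CASCADE.
cnxn.execute("DELETE FROM genus WHERE id = 1")
remaining = cnxn.execute("SELECT COUNT(*) FROM species").fetchone()[0]
print(remaining)  # 0
```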
indexia.schemata module
Defines tree & graph representations of indexia data.
- class indexia.schemata.Corpus(db, genus, creators, max_depth=10)
Bases: object
Represent indexia data as a dataframe.
- assemble()
Assemble the corpus of each of the creator entities.
- Returns:
corpus – Dataframe representing all creatures of the instance’s creator entity, up to the distance specified by max_depth.
- Return type:
pandas.DataFrame
- get_trait(species)
Gets the trait (attribute) column of the given species.
- Parameters:
species (str) – Name of the creature (child) table.
- Returns:
trait – Name of the trait column.
- Return type:
str
- make_limbs(genus, creator, depth)
Moves down the spine to create lists of dataframes representing indexia entity data.
- Parameters:
genus (str) – Name of the creator (parent) table.
creator (pandas.DataFrame) – Single-row dataframe of creator entity data.
depth (int) – Current level in the corpus rendering process. Compared with max_depth to determine whether to proceed.
- Returns:
limbs – List of dataframes representing indexia entity data.
- Return type:
list(pandas.DataFrame)
- make_member(genus, creator, species, creatures)
Creates a dataframe of indexia entity data.
- Parameters:
genus (str) – Name of the creator (parent) table.
creator (pandas.DataFrame) – Single-row dataframe of creator entity data.
species (str) – Name of the creature (child) table.
creatures (pandas.DataFrame) – Dataframe of creature entity data.
- Returns:
member – Dataframe describing creature entities, including creator information.
- Return type:
pandas.DataFrame
- to_csv(corpus, file_path, **kwargs)
Save an assembled corpus dataframe to a CSV file.
- Parameters:
corpus (pandas.DataFrame) – Dataframe representing indexia data, created by the assemble method of this class.
file_path (str) – Path of the CSV file to be created.
**kwargs (any) – Any keyword arguments accepted by pandas.DataFrame.to_csv.
- Returns:
file_path – Path to the corpus CSV file.
- Return type:
str
- class indexia.schemata.Dendron(db)
Bases: object
Represent indexia data as an XML tree.
- render_image(genus, creators, root=<Element 'root'>)
Render the XML tree.
- Parameters:
genus (str) – Name of the top-level table.
creators (pandas.DataFrame) – One or more rows of the top-level table to render as XML.
root (xml.etree.ElementTree.Element, optional) – Root element of the XML tree, used in iterative calls to this method. It is not typically necessary to supply this argument. The default is xml.etree.ElementTree.Element(‘root’).
- Returns:
image – An XML element tree of indexia data.
- Return type:
xml.etree.ElementTree.ElementTree
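Rendering a hierarchy as an XML tree can be sketched with xml.etree.ElementTree. The example replaces database lookups with nested dictionaries, & the element & attribute naming (table name as tag, trait as attribute) is an assumption rather than the layout Dendron actually produces.

```python
import xml.etree.ElementTree as ET


def render(kind, beings, root=None):
    """Recursively add one element per being under root (sketch).

    beings is a list of dicts with a "trait" value & an optional
    "creatures" dict mapping child table names to lists of child beings.
    """
    # Create a fresh root per top-level call instead of sharing a default Element.
    root = ET.Element("root") if root is None else root
    for being in beings:
        node = ET.SubElement(root, kind, {"trait": being["trait"]})
        for child_kind, child_beings in being.get("creatures", {}).items():
            render(child_kind, child_beings, node)
    return ET.ElementTree(root)


creators = [{"trait": "alpha", "creatures": {"species": [{"trait": "beta"}]}}]
image = render("genus", creators)
print(image.find("genus/species").get("trait"))  # beta
```

The sketch takes None as the default for root because Python evaluates default argument values once, so a default Element instance would be shared between calls.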
- write_image(image, file_path=None, open_browser=True)
Write the XML image of the Dendron instance to an XML file, & optionally open in the browser.
- Parameters:
image (xml.etree.ElementTree.ElementTree) – Image of the current Dendron instance as an XML tree.
file_path (str, optional) – Path where the XML file will be created. If None, the default (dendron.xml) is used. The default is None.
open_browser (bool, optional) – If True, open the XML file in the default browser. The default is True.
- Returns:
file_path – Absolute path to the XML image file.
- Return type:
str
- class indexia.schemata.Diktua(corpus, as_nodes, as_edges, self_edges=False)
Bases: object
Represent indexia data as a network graph.
- get_graph_elements()
Get graph nodes & edges.
- Returns:
nodes (list) – List of graph nodes.
edges (list) – List of tuples representing graph edges.
- get_node_info()
Count node edges & assign titles.
Edge counts are used to determine node size when the graph is displayed; titles are shown when hovering over nodes in the display.
- Returns:
node_edges (dict) – Keys are graph nodes; values are counts of edges on each node.
node_titles (dict) – Keys are graph nodes; values are string titles assigned to nodes.
- get_node_sizes(node_edges, min_size, max_size)
Calculate node size based on number of edges.
Node sizes are scaled to the interval [min_size, max_size].
- Parameters:
node_edges (dict) – Dictionary of graph nodes & edge counts.
min_size (int) – Minimum node size.
max_size (int) – Maximum node size.
- Returns:
node_sizes – Keys are graph nodes; values are node sizes.
- Return type:
dict
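Scaling edge counts to the interval [min_size, max_size] can be done with linear min-max scaling, as in this sketch. The linear formula is an assumption; the documentation above states only the target interval, not the scaling function used.

```python
def get_node_sizes(node_edges, min_size, max_size):
    """Linearly scale edge counts to [min_size, max_size] (sketch)."""
    lowest = min(node_edges.values())
    highest = max(node_edges.values())
    span = highest - lowest
    node_sizes = {}
    for node, count in node_edges.items():
        # When every node has the same edge count, fall back to min_size.
        fraction = (count - lowest) / span if span else 0.0
        node_sizes[node] = min_size + fraction * (max_size - min_size)
    return node_sizes


sizes = get_node_sizes({"a": 1, "b": 3, "c": 5}, min_size=7, max_size=49)
print(sizes)  # {'a': 7.0, 'b': 28.0, 'c': 49.0}
```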
- make_undirected_graph()
Create an undirected network graph from the corpus attribute of the instance.
- Returns:
G – An undirected network graph of instance data.
- Return type:
networkx.Graph
- plot(plot_path=None, open_browser=False)
Create a plot of the instance’s graph.
- Parameters:
plot_path (str or None, optional) – If supplied, plot will be written to an HTML file at plot_path. The default is None.
open_browser (bool, optional) – Whether to open the plot in the browser. The default is False.
- Returns:
plot (pyvis.network.Network) – A plot of the instance’s network graph.
plot_path (str or None) – If plot_path is set, returns the path of the output HTML file. Otherwise None.
- style_nodes(min_size=7, max_size=49)
Set size & title attributes of graph nodes.
- Parameters:
min_size (int, optional) – Minimum node size. The default is 7.
max_size (int, optional) – Maximum node size. The default is 49.
- Returns:
Network graph with node attributes set.
- Return type:
networkx.Graph
- to_csv(file_path, **kwargs)
Save the edges of the instance’s graph to a CSV file with columns ‘source’ & ‘target’.
- Parameters:
file_path (str) – Path of the CSV file to be created.
**kwargs (any) – Any keyword arguments accepted by pandas.DataFrame.to_csv.
- Returns:
file_path – Path to the output CSV file.
- Return type:
str
- class indexia.schemata.ScalaNaturae(db)
Bases: object
Ascend & descend the hierarchy of indexia data.
- climb(kind, being, direction)
Climb one rung in either direction (up or down).
- Parameters:
kind (str) – Name of the starting table.
being (pandas.DataFrame) – Dataframe of creator or creature entities. If the dataframe contains more than one row, only results for the first row will be returned.
direction (str) – Direction to climb. Must be either ‘up’ or ‘down’.
- Raises:
ValueError – If direction is neither ‘up’ nor ‘down’.
- Returns:
next_rung – List of tuples of the form (kind, beings), where kind is the name of a creator or creature table, & beings is a dataframe of creator or creature entity data.
- Return type:
list(tuple)
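The direction handling described above amounts to dispatching on ‘up’ or ‘down’ & rejecting anything else. A simplified sketch follows, using a hypothetical in-memory map of child tables to parent tables in place of a database; it returns table names only, not entity data.

```python
# Hypothetical table hierarchy: each child table maps to its single parent.
PARENT_OF = {"species": "genus", "subspecies": "species"}


def climb(kind, direction):
    """Return the table names one rung above or below kind (sketch)."""
    if direction == "up":
        parent = PARENT_OF.get(kind)
        return [parent] if parent is not None else []
    if direction == "down":
        return [child for child, parent in PARENT_OF.items() if parent == kind]
    raise ValueError("direction must be 'up' or 'down'")


print(climb("species", "up"))    # ['genus']
print(climb("species", "down"))  # ['subspecies']
```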
- downward(genus, creator)
Climb down one rung.
- Parameters:
genus (str) – Name of the starting creator table.
creator (pandas.DataFrame) – A single-row dataframe of creator entity data.
- Returns:
next_rung – List of tuples of the form (species, creature), where species is the name of the creature table & creature is a dataframe of creature entity data.
- Return type:
list(tuple)
- upward(species, creature)
Climb up one rung.
- Parameters:
species (str) – Name of the starting creature table.
creature (pandas.DataFrame) – A single-row dataframe of creature entity data.
- Returns:
next_rung – List containing one tuple of the form (genus, creator), where genus is the name of the creator table & creator is a single-row dataframe of creator entity data.
- Return type:
list(tuple)