indexia package

indexia.eidola module

Make sample creators & creatures.

class indexia.eidola.Maker(test_db, species_per_genus, num_beings, trait)

Bases: object

Fashion any number of creators & creatures for testing.

get()

Get test data.

Returns:

  • fathers (list(pandas.DataFrame)) – List containing a single dataframe of creator data.

  • sons (list(pandas.DataFrame)) – List containing species_per_genus dataframes of creature data.

  • grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^2 dataframes of creature data.

  • great_grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^3 dataframes of creature data.

make()

Make test data.

Makes a single creator table & 3 generations of creature tables. Each table has species_per_genus child tables in the next generation, with num_beings creatures in each table. The creators & creatures all have the attribute trait.

Returns:

  • fathers (list(pandas.DataFrame)) – List containing a single dataframe of creator data.

  • sons (list(pandas.DataFrame)) – List containing species_per_genus dataframes of creature data.

  • grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^2 dataframes of creature data.

  • great_grandsons (list(pandas.DataFrame)) – List containing (species_per_genus)^3 dataframes of creature data.
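The list lengths in the Returns section grow as powers of species_per_genus. The arithmetic can be sketched in plain Python; generation_sizes is a hypothetical helper written for illustration & is not part of indexia.

```python
def generation_sizes(species_per_genus, generations=3):
    """Return the table count for the creator level and each creature generation.

    Illustrative only: this is not an indexia function.
    """
    # Generation 0 is the single creator table; generation k has n**k tables.
    return [species_per_genus ** k for k in range(generations + 1)]


fathers, sons, grandsons, great_grandsons = generation_sizes(3)
print(fathers, sons, grandsons, great_grandsons)  # 1 3 9 27
```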

make_creators(ix, cnxn, genus)

Make creator beings.

The number of beings specified by num_beings will be created in a table named genus.

Parameters:
  • ix (indexia.indexia.Indexia) – An Indexia instance.

  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table.

Returns:

creators – Dataframe of creator data.

Return type:

list(pandas.DataFrame)

make_creatures(ix, cnxn, genus, species)

Make sample creatures with a given genus & species.

A creature table named species is created, & num_beings creature records are added to the table. Each creature has the attribute trait.

Parameters:
  • ix (indexia.indexia.Indexia) – An Indexia instance.

  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table.

  • species (str) – Name of the creature (child) table.

Returns:

creatures – Dataframe of creature data.

Return type:

pandas.DataFrame

make_species(ix, cnxn, genus, species_prefix)

Make one or more species of a given genus.

The number of species created in the genus is given by species_per_genus. For each species created, num_beings creature records are added to the species, each having the attribute trait.

Parameters:
  • ix (indexia.indexia.Indexia) – An Indexia instance.

  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table.

  • species_prefix (str) – Prefix of the creature (child) table names.

  • species (str) – Name of the creature (child) table.

Returns:

species – List of dataframes containing creature data.

Return type:

list(pandas.DataFrame)

class indexia.eidola.Templates(db)

Bases: object

Create template indexia objects.

build_template(template_name)

Create objects for the given template.

Parameters:

template_name (str) – Name of the template to build.

Raises:

ValueError – If template_name is not a valid template, raise a ValueError.

Returns:

objects – List of tuples containing table names & object data.

Return type:

list(tuple(str, pandas.DataFrame))

show_templates()

Show available templates.

Returns:

templates – Dictionary of available template names & table structures.

Return type:

dict

indexia.indexia module

Defines core operations on indexia objects.

class indexia.indexia.Indexia(db=None)

Bases: object

Core class for creating, modifying, & retrieving indexia objects.

add_creator(cnxn, genus, trait, expr)

Get or create a creator entity.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table to be retrieved or created.

  • trait (str) – Name of the creator’s text attribute.

  • expr (str) – Value of the creator’s text attribute.

Returns:

creator – A single-row dataframe of creator entity data.

Return type:

pandas.DataFrame

add_creature(cnxn, genus, creator, species, trait, expr)

Get or create a creature of a given creator.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table.

  • creator (pandas.DataFrame) – A single-row dataframe of creator entity data.

  • species (str) – Name of the creature (child) table to be retrieved or created.

  • trait (str) – Name of the creature’s text attribute.

  • expr (str) – Value of the creature’s text attribute.

Returns:

creature – A single-row dataframe of creature entity data.

Return type:

pandas.DataFrame

close_all_cnxns()

Close all database connections.

Return type:

None.

close_cnxn(db)

Close connections to a database.

Parameters:

db (str) – Path to the database file.

Return type:

None.

delete(cnxn, species, entity_id)

Delete an entity from a table by ID.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • species (str) – Name of the table from which to delete.

  • entity_id (int) – ID of the entity to delete.

Returns:

rows_deleted – Count of rows affected by DELETE statement.

Return type:

int

get_all_tables(cnxn)

Get all tables in the instance database.

Parameters:

cnxn (sqlite3.Connection) – A database connection.

Returns:

tables – Dataframe describing all database tables.

Return type:

pandas.DataFrame

get_by_id(cnxn, kind, being_id)

Get an entity by its id.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • kind (str) – Name of the table to query.

  • being_id (int) – Value of the entity’s id.

Returns:

being – Dataframe of being data.

Return type:

pandas.DataFrame

get_by_trait(cnxn, kind, expr)

Get being(s) by the text attribute value.

Note that since values of the trait column need not be unique, it is possible that the dataframe returned will contain more than one being.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • kind (str) – Name of the table to query.

  • expr (str) – Value of the being’s trait (text attribute).

Returns:

being – Dataframe of one or more beings.

Return type:

pandas.DataFrame

get_creator(cnxn, species, creature)

Get the creator of a given creature.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • species (str) – Name of the creature (child) table.

  • creature (pandas.DataFrame) – A single-row dataframe of creature entity data.

Returns:

  • genus (str) – Name of the creator (parent) table.

  • creator (pandas.DataFrame) – A single-row dataframe of creator entity data.

get_creator_genus(cnxn, species)

Get table name of creator (parent) table.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • species (str) – Name of the creature (child) table.

Raises:

ValueError – If more than one creator table is found, raise a ValueError. Each creature table should have one & only one creator.

Returns:

genus – Name of the creator (parent) table.

Return type:

str
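One way such a lookup could work is to read the child table's foreign keys. The sketch below uses only the standard sqlite3 module & assumes the creature table references its creator through a foreign key; how indexia actually resolves the parent is not stated here. The helper find_parent_table & the authors/books tables are hypothetical, & unlike the documented behaviour the sketch also raises when no parent is found.

```python
import sqlite3


def find_parent_table(cnxn, child_table):
    """Return the single table referenced by child_table's foreign keys.

    Illustrative only: this is not an indexia function.
    """
    # PRAGMA foreign_key_list yields one row per foreign-key column;
    # index 2 of each row is the name of the referenced (parent) table.
    rows = cnxn.execute(f'PRAGMA foreign_key_list("{child_table}")').fetchall()
    parents = {row[2] for row in rows}
    if len(parents) != 1:
        raise ValueError(
            f"expected exactly one parent table for {child_table!r}, "
            f"found {len(parents)}"
        )
    return parents.pop()


cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
cnxn.execute(
    "CREATE TABLE books ("
    "id INTEGER PRIMARY KEY, title TEXT, "
    "authors_id INTEGER REFERENCES authors (id))"
)
print(find_parent_table(cnxn, "books"))  # authors
```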

get_creature_species(cnxn, genus)

Get types of all creatures with a given creator genus.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table.

Returns:

species – List of creature (child) table names.

Return type:

list(str)

get_creatures(cnxn, genus, creator)

Get all creatures of a given creator.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • genus (str) – Name of the creator (parent) table.

  • creator (pandas.DataFrame) – A single-row dataframe of creator entity data.

Returns:

creatures – List of two-tuples whose first entry is the name of the creature (child) table, & whose second entry is a dataframe of creature data.

Return type:

list(tuple(str, pandas.DataFrame))

get_df(cnxn, sql, expected_columns=None, raise_errors=False)

Get result of SQL query as a pandas dataframe. In the event of an exception, return an empty dataframe.

Parameters:
  • cnxn (sqlite3.Connection) – Connection to the database.

  • sql (str) – SQL to be executed by pandas.read_sql.

  • expected_columns (list(str), optional) – List of expected columns. If raise_errors is True & the dataframe columns do not match expected_columns, a ValueError is raised. The default is None.

  • raise_errors (bool, optional) – Whether to raise exceptions encountered during execution. The default is False.

Raises:

error – If raise_errors is True, raise any error encountered during execution.

Returns:

df – A dataframe containing the results of the SQL query.

Return type:

pandas.DataFrame

get_or_create(cnxn, tablename, dtype, cols, vals, retry=True)

Get entities from an existing table, or create the table & (optionally) insert them.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • tablename (str) – Name of the database table. If the table does not exist, it will be created.

  • dtype (dict(str, str)) – Dict of table columns & column data types.

  • cols (list(str)) – Columns to be used in SELECT statement.

  • vals (list(str)) – Values to be used in SELECT statement.

  • retry (bool, optional) – If True & SELECT returns an empty result, INSERT the specified values & try again. The default is True.

Raises:

ValueError – Raised when no matching rows are found & retry is False.

Returns:

result – A dataframe of rows matching column & value criteria.

Return type:

pandas.DataFrame
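The get-or-create flow described above can be sketched with the standard sqlite3 module. This is an approximation under stated assumptions, not indexia's implementation: get_or_create_rows is a hypothetical helper, it returns a list of row tuples rather than a dataframe, & the authors table is made up for the example.

```python
import sqlite3


def get_or_create_rows(cnxn, tablename, dtype, cols, vals, retry=True):
    """Select rows matching cols/vals, creating the table and row if needed.

    Illustrative only: this is not an indexia function.
    """
    # Identifiers cannot be bound as parameters, so they are interpolated;
    # only pass trusted table and column names.
    col_defs = ", ".join(f"{name} {kind}" for name, kind in dtype.items())
    cnxn.execute(f"CREATE TABLE IF NOT EXISTS {tablename} ({col_defs})")

    where = " AND ".join(f"{col} = ?" for col in cols)
    rows = cnxn.execute(f"SELECT * FROM {tablename} WHERE {where}", vals).fetchall()
    if rows:
        return rows
    if not retry:
        raise ValueError(f"no rows in {tablename} match {dict(zip(cols, vals))}")

    placeholders = ", ".join("?" for _ in vals)
    cnxn.execute(
        f"INSERT INTO {tablename} ({', '.join(cols)}) VALUES ({placeholders})",
        vals,
    )
    # Retry once, with retry disabled so a failed insert cannot loop forever.
    return get_or_create_rows(cnxn, tablename, dtype, cols, vals, retry=False)


cnxn = sqlite3.connect(":memory:")
dtype = {"id": "INTEGER PRIMARY KEY", "name": "TEXT"}
first = get_or_create_rows(cnxn, "authors", dtype, ["name"], ["Homer"])
second = get_or_create_rows(cnxn, "authors", dtype, ["name"], ["Homer"])
print(first == second)  # True
```

The second call finds the row inserted by the first, so no duplicate is created.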

get_table_columns(cnxn, tablename)

Get columns of a database table.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • tablename (str) – Name of the database table.

Returns:

columns – Dataframe describing table columns.

Return type:

pandas.DataFrame

get_trait(cnxn, kind)

Gets the trait (attribute) column of the given kind.

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • kind (str) – Name of the table.

Raises:

ValueError – If no trait column is identified, or if more than one trait column is identified, raise a ValueError.

Returns:

trait – Name of the trait column.

Return type:

str

open_cnxn(db)

Open a connection to a database.

Parameters:

db (str) – The name of the database.

Returns:

cnxn – Connection to the database.

Return type:

sqlite3.Connection

update(cnxn, tablename, set_cols, set_vals, where_cols, where_vals)

Update values in a database table. Executes a SQL statement of the form

    UPDATE {tablename}
    SET {set_cols[0]} = {set_vals[0]}, {set_cols[1]} = {set_vals[1]}, …
    WHERE {where_cols[0]} = {where_vals[0]} AND {where_cols[1]} = {where_vals[1]} AND …

Parameters:
  • cnxn (sqlite3.Connection) – A database connection.

  • tablename (str) – Name of the table to update.

  • set_cols (list(str)) – List of columns to update.

  • set_vals (list(any)) – Updated values for columns.

  • where_cols (list(str)) – List of columns for WHERE condition.

  • where_vals (list(any)) – List of values for WHERE condition.

Returns:

rows_updated – Number of rows affected by update statement.

Return type:

int
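A statement of this form can be built & executed with the standard sqlite3 module as sketched below. The helper update_rows & the authors table are hypothetical; the sketch binds values with "?" placeholders, which may differ from how indexia formats values.

```python
import sqlite3


def update_rows(cnxn, tablename, set_cols, set_vals, where_cols, where_vals):
    """Run an UPDATE ... SET ... WHERE ... statement and return the row count.

    Illustrative only: this is not an indexia function.
    """
    # Values are bound with "?" placeholders; identifiers are interpolated
    # and must come from trusted input.
    set_clause = ", ".join(f"{col} = ?" for col in set_cols)
    where_clause = " AND ".join(f"{col} = ?" for col in where_cols)
    sql = f"UPDATE {tablename} SET {set_clause} WHERE {where_clause}"
    cursor = cnxn.execute(sql, [*set_vals, *where_vals])
    cnxn.commit()
    return cursor.rowcount


cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
cnxn.executemany("INSERT INTO authors (name) VALUES (?)", [("Homer",), ("Hesiod",)])
rows_updated = update_rows(cnxn, "authors", ["name"], ["Sappho"], ["id"], [2])
print(rows_updated)  # 1
```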

indexia.inquiry module

Generate SQL for indexia database operations.

class indexia.inquiry.Inquiry

Bases: object

Generate SQL strings from dynamic inputs.

create(columns)

Get a SQL CREATE TABLE statement.

Parameters:
  • tablename (str) – Name of the table to create.

  • columns (dict(str)) – Dict of columns to add to table. Keys are column names, values are data types.

Returns:

create – A formatted SQL CREATE TABLE statement.

Return type:

str

delete(conditions='')

Get a SQL DELETE FROM statement.

Parameters:
  • tablename (str) – Name of the table from which to delete.

  • conditions (str, optional) – Optional WHERE conditions. The default is ‘’.

Returns:

delete – A formatted SQL DELETE FROM statement.

Return type:

str

insert(values, columns=None)

Get a SQL INSERT statement.

Parameters:
  • tablename (str) – Name of table into which values will be inserted.

  • values (list(str) or list(tuple(str))) – A list of strings or tuples containing strings. Should be equal-length values representing the values to insert.

Returns:

insert – A formatted SQL INSERT statement.

Return type:

str

select(columns, conditions='')

Get a SQL SELECT statement.

Parameters:
  • tablename (str) – Name of the table from which to select values.

  • columns (list(str)) – List of column names to select.

  • conditions (str, optional) – A SQL-formatted string of conditions. The default is ‘’.

Returns:

select – A formatted SQL SELECT statement.

Return type:

str

update(set_cols, set_values, conditions='')

Get a SQL UPDATE statement.

Parameters:
  • tablename (str) – Name of the table in which to update rows.

  • set_cols (list(str)) – List of column names to update.

  • set_values (list(any)) – List of values with which to update columns. Paired with set_cols such that set_cols[i] = set_values[i].

  • conditions (str, optional) – A SQL-formatted string of conditions. The default is ‘’.

Returns:

update – A formatted SQL UPDATE statement.

Return type:

str

where(vals, conjunction='AND')

Construct WHERE condition from columns & values.

Parameters:
  • cols (list(str)) – List of column names.

  • vals (list(any)) – List of values.

  • conjunction (str, optional) – SQL keyword to use as conjunction between clauses (e.g., AND, OR).

Returns:

conditions – A SQL-formatted WHERE condition.

Return type:

str
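As a rough illustration of pairing columns with values, the sketch below joins equality clauses with the chosen conjunction. The function where_clause is hypothetical; its quoting rules (single-quoted strings with doubled embedded quotes, IS NULL for None) are assumptions & may not match the output of Inquiry.where.

```python
def where_clause(cols, vals, conjunction="AND"):
    """Build a WHERE condition string from paired columns and values.

    Illustrative only: this is not an indexia function.
    """
    if len(cols) != len(vals):
        raise ValueError("cols and vals must have the same length")
    clauses = []
    for col, val in zip(cols, vals):
        if val is None:
            # "col = NULL" never matches in SQL, so use IS NULL instead.
            clauses.append(f"{col} IS NULL")
        elif isinstance(val, str):
            # Quote strings and double any embedded single quotes.
            escaped = val.replace("'", "''")
            clauses.append(f"{col} = '{escaped}'")
        else:
            clauses.append(f"{col} = {val}")
    return "WHERE " + f" {conjunction} ".join(clauses)


print(where_clause(["id", "name"], [1, "O'Brien"]))
# WHERE id = 1 AND name = 'O''Brien'
```

Building SQL by string formatting is only safe for trusted input; where values come from users, bound parameters are the safer choice.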

class indexia.inquiry.Tabula

Bases: object

Defines columns & data types of indexia tables.

get_creator_table(trait)

Get name & columns of a creator (parent) table.

Parameters:
  • genus (str) – Name of the creator (parent) table.

  • trait (str) – Name of the creator’s text attribute.

Returns:

creator_table – A tuple whose first entry is the name of the creator table, & whose second is a dict of table columns & data types.

Return type:

tuple(str, dict)

get_creature_table(species, trait)

Get name & columns of a creature (child) table.

Parameters:
  • creator (str) – Name of the creator (parent) table.

  • name (str) – Name of the creature table.

  • attribute (str) – Name of the creature’s text attribute.

Returns:

creature_table – A tuple whose first entry is the name of the creature table, & whose second is a dict of table columns & data types.

Return type:

tuple(str, dict)

references(on_column, on_delete='CASCADE', on_update='CASCADE')

Generate SQL-formatted REFERENCES clause.

Parameters:
  • tablename (str) – Name of the referenced table.

  • on_column (str) – Name of the referenced column.

  • on_delete (str, optional) – Behavior of the child entity when the parent entity is deleted. The default is ‘CASCADE’.

  • on_update (str, optional) – Behavior of the child entity when the parent entity is updated. The default is ‘CASCADE’.

Returns:

references – A SQL-formatted REFERENCES clause.

Return type:

str
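The effect of a REFERENCES clause with cascading actions can be tried out in SQLite directly. In this sketch references_clause is a hypothetical stand-in whose output format is assumed, not taken from Tabula.references; the authors & books tables are made up for the example.

```python
import sqlite3


def references_clause(tablename, on_column, on_delete="CASCADE", on_update="CASCADE"):
    """Format a REFERENCES clause for a foreign-key column definition.

    Illustrative only: this is not an indexia function.
    """
    return (
        f"REFERENCES {tablename} ({on_column}) "
        f"ON DELETE {on_delete} ON UPDATE {on_update}"
    )


cnxn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
cnxn.execute("PRAGMA foreign_keys = ON")
cnxn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
cnxn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, "
    f"authors_id INTEGER {references_clause('authors', 'id')})"
)
cnxn.execute("INSERT INTO authors (name) VALUES ('Homer')")
cnxn.execute("INSERT INTO books (title, authors_id) VALUES ('Iliad', 1)")

# Deleting the parent row cascades to the child row.
cnxn.execute("DELETE FROM authors WHERE id = 1")
remaining = cnxn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(remaining)  # 0
```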

indexia.schemata module

Defines tree & graph representations of indexia data.

class indexia.schemata.Corpus(db, genus, creators, max_depth=10)

Bases: object

Represent indexia data as a dataframe.

assemble()

Assemble the corpus of each of the creator entities.

Returns:

corpus – Dataframe representing all creatures of the instance’s creator entity, up to the distance specified by max_depth.

Return type:

pandas.DataFrame

get_trait(species)

Gets the trait (attribute) column of the given species.

Parameters:

species (str) – Name of the creature (child) table.

Returns:

trait – Name of the trait column.

Return type:

str

make_limbs(genus, creator, depth)

Moves down the spine to create lists of dataframes representing indexia entity data.

Parameters:
  • genus (str) – Name of the creator (parent) table.

  • creator (pandas.DataFrame) – Single-row dataframe of creator entity data.

  • depth (int) – Current level in the corpus rendering process. Compared with max_depth to determine whether to proceed.

Returns:

limbs – List of dataframes representing indexia entity data.

Return type:

list(pandas.DataFrame)

make_member(genus, creator, species, creatures)

Creates a dataframe of indexia entity data.

Parameters:
  • genus (str) – Name of the creator (parent) table.

  • creator (pandas.DataFrame) – Single-row dataframe of creator entity data.

  • species (str) – Name of the creature (child) table.

  • creatures (pandas.DataFrame) – Dataframe of creature entity data.

Returns:

member – Dataframe describing creature entities, including creator information.

Return type:

pandas.DataFrame

to_csv(corpus, file_path, **kwargs)

Save an assembled corpus dataframe to a CSV file.

Parameters:
  • corpus (pandas.DataFrame) – Dataframe representing indexia data, created by the assemble method of this class.

  • file_path (str) – Path of the CSV file to be created.

  • **kwargs (any) – Any keyword arguments accepted by pandas.DataFrame.to_csv.

Returns:

file_path – Path to the corpus CSV file.

Return type:

str

class indexia.schemata.Dendron(db)

Bases: object

Represent indexia data as an XML tree.

render_image(genus, creators, root=<Element 'root'>)

Render the XML tree.

Parameters:
  • genus (str) – Name of the top-level table.

  • creators (pandas.DataFrame) – One or more rows of the top-level table to render as XML.

  • root (xml.etree.ElementTree.Element, optional) – Root element of the XML tree, used in iterative calls to this method. It is not typically necessary to supply this argument. The default is xml.etree.ElementTree.Element(‘root’).

Returns:

image – An XML element tree of indexia data.

Return type:

xml.etree.ElementTree.ElementTree
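Nesting child entities under their parents maps naturally onto xml.etree.ElementTree. The sketch below shows the general idea with a plain dict standing in for the database; render_children, the "being" tag & the "name" attribute are assumptions, since the element layout Dendron produces is not described here.

```python
import xml.etree.ElementTree as ET


def render_children(children_of, parent, parent_element):
    """Recursively attach each descendant of parent as a nested XML element.

    Illustrative only: this is not an indexia function.
    """
    for child in children_of.get(parent, []):
        element = ET.SubElement(parent_element, "being", {"name": child})
        render_children(children_of, child, element)
    return parent_element


# Hypothetical hierarchy: one creator with two creatures, one of which
# has a creature of its own.
children_of = {"Homer": ["Iliad", "Odyssey"], "Iliad": ["Book 1"]}
root = ET.Element("root")
homer = ET.SubElement(root, "being", {"name": "Homer"})
render_children(children_of, "Homer", homer)

image = ET.ElementTree(root)
print(ET.tostring(image.getroot(), encoding="unicode"))
```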

write_image(image, file_path=None, open_browser=True)

Write the XML image of the Dendron instance to an XML file, & optionally open in the browser.

Parameters:
  • image (xml.etree.ElementTree.ElementTree) – Image of the current Dendron instance as an XML tree.

  • file_path (str, optional) – Path where the XML file will be created. If None, the default (dendron.xml) is used. The default is None.

  • open_browser (bool, optional) – If True, open the XML file in the default browser. The default is True.

Returns:

file_path – Absolute path to the XML image file.

Return type:

str

class indexia.schemata.Diktua(corpus, as_nodes, as_edges, self_edges=False)

Bases: object

Represent indexia data as a network graph.

get_graph_elements()

Get graph nodes & edges.

Returns:

  • nodes (list) – List of graph nodes.

  • edges (list) – List of tuples representing graph edges.

get_node_info()

Count node edges & assign titles.

Edge counts are used to determine node size when the graph is displayed; titles are shown when hovering over nodes in the display.

Returns:

  • node_edges (dict) – Keys are graph nodes; values are counts of edges on each node.

  • node_titles (dict) – Keys are graph nodes; values are string titles assigned to nodes.

get_node_sizes(node_edges, min_size, max_size)

Calculate node size based on number of edges.

Node sizes are scaled to the interval [min_size, max_size].

Parameters:
  • node_edges (dict) – Dictionary of graph nodes & edge counts.

  • min_size (int) – Minimum node size.

  • max_size (int) – Maximum node size.

Returns:

node_sizes – Keys are graph nodes; values are node sizes.

Return type:

dict
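Scaling edge counts onto [min_size, max_size] can be done with linear min-max scaling, sketched below. Linear scaling is an assumption; the text does not say which mapping get_node_sizes uses. The function scale_node_sizes is hypothetical.

```python
def scale_node_sizes(node_edges, min_size, max_size):
    """Map edge counts linearly onto the interval [min_size, max_size].

    Illustrative only: this is not an indexia function.
    """
    if not node_edges:
        return {}
    counts = node_edges.values()
    low, high = min(counts), max(counts)
    if high == low:
        # All nodes have the same edge count; give them all the minimum size.
        return {node: min_size for node in node_edges}
    span = max_size - min_size
    return {
        node: min_size + span * (count - low) / (high - low)
        for node, count in node_edges.items()
    }


sizes = scale_node_sizes({"a": 1, "b": 3, "c": 5}, 7, 49)
print(sizes)  # {'a': 7.0, 'b': 28.0, 'c': 49.0}
```

The node with the fewest edges gets min_size, the node with the most gets max_size, & the rest fall proportionally in between.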

make_undirected_graph()

Create an undirected network graph from the corpus attribute of the instance.

Returns:

G – An undirected network graph of instance data.

Return type:

networkx.Graph

plot(plot_path=None, open_browser=False)

Create a plot of the instance’s graph.

Parameters:
  • plot_path (str or None, optional) – If supplied, plot will be written to an HTML file at plot_path. The default is None.

  • open_browser (bool, optional) – Whether to open the plot in the browser. The default is False.

Returns:

  • plot (pyvis.network.Network) – A plot of the instance’s network graph.

  • plot_path (str or None) – If plot_path is set, returns the path of the output HTML file. Otherwise None.

style_nodes(min_size=7, max_size=49)

Set size & title attributes of graph nodes.

Parameters:
  • min_size (int, optional) – Minimum node size. The default is 7.

  • max_size (int, optional) – Maximum node size. The default is 49.

Returns:

Network graph with node attributes set.

Return type:

networkx.Graph

to_csv(file_path, **kwargs)

Save the edges of the instance’s graph to a CSV file with columns ‘source’ & ‘target’.

Parameters:
  • file_path (str) – Path of the CSV file to be created.

  • **kwargs (any) – Any keyword arguments accepted by pandas.DataFrame.to_csv.

Returns:

file_path – Path to the output CSV file.

Return type:

str

class indexia.schemata.ScalaNaturae(db)

Bases: object

Ascend & descend the hierarchy of indexia data.

climb(kind, being, direction)

Climb one rung in either direction (up or down).

Parameters:
  • kind (str) – Name of the starting table.

  • being (pandas.DataFrame) – Dataframe of creator or creature entities. If the dataframe contains more than one row, only results for the first row will be returned.

  • direction (str) – Direction to climb. Must be either ‘up’ or ‘down’.

Raises:

ValueError – If direction is not either ‘up’ or ‘down’, raise a ValueError.

Returns:

next_rung – List of tuples of the form (kind, beings), where kind is the name of a creator or creature table, & beings is a dataframe of creator or creature entity data.

Return type:

list(tuple)

downward(genus, creator)

Climb down one rung.

Parameters:
  • genus (str) – Name of the starting creator table.

  • creator (pandas.DataFrame) – A single-row dataframe of creator entity data.

Returns:

next_rung – List of tuples of the form (species, creature), where species is the name of the creature table & creature is a dataframe of creature entity data.

Return type:

list(tuple)

upward(species, creature)

Climb up one rung.

Parameters:
  • species (str) – Name of the starting creature table.

  • creature (pandas.DataFrame) – A single-row dataframe of creature entity data.

Returns:

next_rung – List containing one tuple of the form (genus, creator), where genus is the name of the creator table & creator is a single-row dataframe of creator entity data.

Return type:

list(tuple)