This class represents heterogeneous datasets in an HDF5 file.
Tables are leaves (see the Leaf class in The Leaf class) whose data consists of a unidimensional sequence of rows, where each row contains one or more fields. Fields have an associated unique name and position, with the first field having position 0. All rows have the same fields, which are arranged in columns.
Fields can have any of the types supported by the Col class and its descendants (see The Col class and its descendants), which support multidimensional data. Moreover, a field can be nested (to an arbitrary depth), meaning that it includes further fields inside. A field named x inside a nested field a in a table can be accessed as the field a/x (its path name) from the table.
The structure of a table is declared by its description, which is made available in the Table.description attribute (see Table).
This class provides new methods to read, write and search table data efficiently. It also provides special Python methods to allow accessing the table as a normal sequence or array (with extended slicing supported).
PyTables supports in-kernel searches working simultaneously on several columns using complex conditions. These are faster than selections using Python expressions. See the Table.where() method for more information on in-kernel searches.
Non-nested columns can be indexed. Searching an indexed column can be several times faster than searching a non-nested one. Search methods automatically take advantage of indexing where available.
When iterating a table, an object from the Row class (see The Row class) is used. This object lets you read and write data one row at a time, and also perform queries that are not supported by the in-kernel syntax (at a much lower speed, of course).
Objects of this class support access to individual columns via natural naming through the Table.cols accessor. Nested columns are mapped to Cols instances, and non-nested ones to Column instances. See the Column class in The Column class for examples of this feature.
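For instance, a minimal sketch (assuming hypothetical columns named pressure, lati and name, which are not part of this class description) of indexing a column and then querying it in-kernel could look like this:

# Hypothetical columns: 'pressure', 'lati' and 'name'
table.cols.pressure.create_index()                     # index a non-nested column
names = [row['name'] for row in
         table.where('(pressure > 10) & (lati < 0)')]  # in-kernel search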
Parameters
    parentnode
    name : str
    description
    title
    filters : Filters
    expectedrows
    chunkshape
    byteorder
Notes
The instance variables below are provided in addition to those in Leaf (see The Leaf class). Please note that there are several col* dictionaries to ease retrieving information about a column directly by its path name, avoiding the need to walk through Table.description or Table.cols.
Table attributes
Maps the name of a column to its Col description (see The Col class and its descendants).
Maps the name of a column to its default value.
Maps the name of a column to its NumPy data type.
Is the column whose name is used as a key indexed?
Maps the name of a column to its Column (see The Column class) or Cols (see The Cols class) instance.
A list containing the names of top-level columns in the table.
A list containing the pathnames of bottom-level columns in the table.
These are the leaf columns obtained when walking the table description left-to-right, bottom-first. Columns inside a nested column have slashes (/) separating name components in their pathname.
A Cols instance that provides natural naming access to non-nested (Column, see The Column class) and nested (Cols, see The Cols class) columns.
Maps the name of a column to its PyTables data type.
A Description instance (see The Description class) reflecting the structure of the table.
The index of the enlargeable dimension (always 0 for tables).
Does this table have any indexed columns?
The current number of rows in the table.
Automatically keep column indexes up to date?
This setting determines whether existing indexes are automatically updated after an append operation, or recomputed after an index-invalidating operation (i.e. removal or modification of rows). The default is true.
This value takes effect whenever a column is altered. If automatic indexing is not active and you want an immediate update, use Table.flush_rows_to_index(); for immediate reindexing of invalidated indexes, use Table.reindex_dirty().
This value is persistent.
Changed in version 3.0: The autoIndex property has been renamed to autoindex.
A dictionary with the indexes of the indexed columns.
List of pathnames of indexed columns in the table.
The associated Row instance (see The Row class).
The size in bytes of each row in the table.
Get a column from the table.
If a column called name exists in the table, it is read and returned as a NumPy object. If it does not exist, a KeyError is raised.
Examples
narray = table.col('var2')
That statement is equivalent to:
narray = table.read(field='var2')
Here you can see how this method can be used as a shorthand for the Table.read() method.
Iterate over the table using a Row instance.
If a range is not supplied, all the rows in the table are iterated upon - you can also use the Table.__iter__() special method for that purpose. If you want to iterate over a given range of rows in the table, you may use the start, stop and step parameters.
Warning
When in the middle of a table row iterator, you should not use methods that can change the number of rows in the table (like Table.append() or Table.remove_rows()) or unexpected errors will happen.
See also
Notes
This iterator can be nested (see Table.where() for an example).
Changed in version 3.0: If the start parameter is provided and stop is None, the table is iterated from start to the last row. In PyTables < 3.0 only one element was returned.
Examples
result = [ row['var2'] for row in table.iterrows(step=5)
if row['var1'] <= 20 ]
Iterate over a sequence of row coordinates.
Notes
This iterator can be nested (see Table.where() for an example).
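For example, a minimal sketch (assuming the var2 column used in the iterrows() example above) that collects a field from a few scattered rows:

# Read field 'var2' only from rows 2, 10 and 42
values = [row['var2'] for row in table.itersequence([2, 10, 42])]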
Iterate table data following the order of the index of sortby column.
The sortby column must have a full index associated with it. If you want to ensure a fully sorted order, the index must be a CSI one. You may want to use the checkCSI argument to explicitly check for the existence of a CSI index.
The meaning of the start, stop and step arguments is the same as in Table.read().
Changed in version 3.0: If the start parameter is provided and stop is None, the table is iterated from start to the last row. In PyTables < 3.0 only one element was returned.
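A hedged sketch, assuming a column named pressure (not part of this description) that already has a full, completely sorted index:

# Iterate rows following the sorted order of the indexed 'pressure' column;
# checkCSI=True raises an error if the index is not a CSI one.
for row in table.itersorted('pressure', checkCSI=True):
    print(row['pressure'])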
Get data in the table as a (record) array.
The start, stop and step parameters can be used to select only a range of rows in the table. Their meanings are the same as in the built-in Python slices.
If field is supplied only the named column will be selected. If the column is not nested, an array of the current flavor will be returned; if it is, a structured array will be used instead. If no field is specified, all the columns will be returned in a structured array of the current flavor.
Columns under a nested column can be specified in the field parameter by using a slash character (/) as a separator (e.g. ‘position/x’).
The out parameter may be used to specify a NumPy array to receive the output data. Note that the array must have the same size as the data selected with the other parameters. Note that the array’s datatype is not checked and no type casting is performed, so if it does not match the datatype on disk, the output will not be correct.
When specifying a single nested column with the field parameter, and supplying an output buffer with the out parameter, the output buffer must contain all columns in the table. The data in all columns will be read into the output buffer. However, only the specified nested column will be returned from the method call.
When data is read from disk in NumPy format, the output will be in the current system’s byteorder, regardless of how it is stored on disk. If the out parameter is specified, the output array also must be in the current system’s byteorder.
Changed in version 3.0: Added the out parameter. Also the start, stop and step parameters now behave like in slice.
Examples
Reading the entire table:
t.read()
Reading record n. 6:
t.read(6, 7)
Reading from record n. 6 to the end of the table:
t.read(6)
Get a set of rows given their indexes as a (record) array.
This method works much like the Table.read() method, but it uses a sequence (coords) of row indexes to select the wanted rows, instead of a row range.
The selected rows are returned in an array or structured array of the current flavor.
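For instance, assuming a table with a var2 column (an assumption for the example), a handful of rows could be read like this:

# Rows 1, 5 and 9 as a structured array, and just their 'var2' field
recarray = table.read_coordinates([1, 5, 9])
var2 = table.read_coordinates([1, 5, 9], field='var2')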
Read table data following the order of the index of sortby column.
The sortby column must have a full index associated with it. If you want to ensure a fully sorted order, the index must be a CSI one. You may want to use the checkCSI argument to explicitly check for the existence of a CSI index.
If field is supplied only the named column will be selected. If the column is not nested, an array of the current flavor will be returned; if it is, a structured array will be used instead. If no field is specified, all the columns will be returned in a structured array of the current flavor.
The meaning of the start, stop and step arguments is the same as in Table.read().
Changed in version 3.0: The start, stop and step parameters now behave like in slice.
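As a sketch, assuming a pressure column with a full index (an assumption, not part of the signature):

# Whole table ordered by 'pressure', and the first ten values of that column
# in index (ascending) order
sorted_rows = table.read_sorted('pressure')
first_ten = table.read_sorted('pressure', field='pressure', start=0, stop=10)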
Get a row or a range of rows from the table.
If key argument is an integer, the corresponding table row is returned as a record of the current flavor. If key is a slice, the range of rows determined by it is returned as a structured array of the current flavor.
In addition, NumPy-style point selections are supported. In particular, if key is a list of row coordinates, the set of rows determined by it is returned. Furthermore, if key is an array of boolean values, only the coordinates where key is True are returned. Note that for the latter to work, the key array must contain exactly as many values as there are rows in the table.
Examples
record = table[4]
recarray = table[4:1000:2]
recarray = table[[4,1000]] # only retrieves rows 4 and 1000
recarray = table[[True, False, ..., True]]
Those statements are equivalent to:
record = table.read(start=4)[0]
recarray = table.read(start=4, stop=1000, step=2)
recarray = table.read_coordinates([4,1000])
recarray = table.read_coordinates([True, False, ..., True])
Here, you can see how indexing can be used as a shorthand for the Table.read() and Table.read_coordinates() methods.
Iterate over the table using a Row instance.
This is equivalent to calling Table.iterrows() with default arguments, i.e. it iterates over all the rows in the table.
See also
Notes
This iterator can be nested (see Table.where() for an example).
Examples
result = [ row['var2'] for row in table if row['var1'] <= 20 ]
Which is equivalent to:
result = [ row['var2'] for row in table.iterrows()
if row['var1'] <= 20 ]
Append a sequence of rows to the end of the table.
The rows argument may be any object which can be converted to a structured array compliant with the table structure (otherwise, a ValueError is raised). This includes NumPy structured arrays, lists of tuples or array records, and a string or Python buffer.
Examples
from tables import *

class Particle(IsDescription):
    name = StringCol(16, pos=1)    # 16-character String
    lati = IntCol(pos=2)           # integer
    longi = IntCol(pos=3)          # integer
    pressure = Float32Col(pos=4)   # float (single-precision)
    temperature = FloatCol(pos=5)  # double (double-precision)

fileh = open_file('test4.h5', mode='w')
table = fileh.create_table(fileh.root, 'table', Particle,
                           "A table")
# Append several rows in only one call
table.append([("Particle: 10", 10, 0, 10 * 10, 10**2),
              ("Particle: 11", 11, -1, 11 * 11, 11**2),
              ("Particle: 12", 12, -2, 12 * 12, 12**2)])
fileh.close()
Modify one single column in the row slice [start:stop:step].
The colname argument specifies the name of the column in the table to be modified with the data given in column. This method returns the number of rows modified. Should the modification exceed the length of the table, an IndexError is raised before changing data.
The column argument may be any object which can be converted to a (record) array compliant with the structure of the column to be modified (otherwise, a ValueError is raised). This includes NumPy (record) arrays, lists of scalars, tuples or array records, and a string or Python buffer.
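For example, a minimal sketch (the pressure column is an assumption, not part of the signature) overwriting every other row in a small range:

# Overwrite 'pressure' in rows 0, 2 and 4 with three new values;
# the call returns the number of rows modified (3 here).
table.modify_column(start=0, stop=6, step=2,
                    column=[1.0, 2.0, 3.0], colname='pressure')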
Modify a series of columns in the row slice [start:stop:step].
The names argument specifies the names of the columns in the table to be modified with the data given in columns. This method returns the number of rows modified. Should the modification exceed the length of the table, an IndexError is raised before changing data.
The columns argument may be any object which can be converted to a structured array compliant with the structure of the columns to be modified (otherwise, a ValueError is raised). This includes NumPy structured arrays, lists of tuples or array records, and a string or Python buffer.
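A hedged sketch, assuming integer columns named lati and longi as in the append() example below:

import numpy
# Replace 'lati' and 'longi' in rows 0 and 1 with values from a record array
new = numpy.rec.fromarrays([[10, 20], [-3, -4]], formats='i4,i4')
table.modify_columns(start=0, stop=2, columns=new, names=['lati', 'longi'])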
Modify a series of rows in positions specified in coords
The values in the selected rows will be modified with the data given in rows. This method returns the number of rows modified.
The possible values for the rows argument are the same as in Table.append().
Modify a series of rows in the slice [start:stop:step].
The values in the selected rows will be modified with the data given in rows. This method returns the number of rows modified. Should the modification exceed the length of the table, an IndexError is raised before changing data.
The possible values for the rows argument are the same as in Table.append().
Remove a range of rows in the table.
Changed in version 3.0: The start, stop and step parameters now behave like in slice.
See also
remove_row()
Parameters
    start : int
    stop : int
    step : int
Examples
Removing rows from 5 to 10 (excluded):
t.remove_rows(5, 10)
Removing all rows starting from the 10th:
t.remove_rows(10)
Removing the 6th row:
t.remove_rows(6, 7)
Note
Removing a single row can be done using the specific remove_row() method.
Set a row or a range of rows in the table.
It takes different actions depending on the type of the key parameter: if it is an integer, the corresponding table row is set to value (a record or sequence capable of being converted to the table structure). If key is a slice, the row slice determined by it is set to value (a record array or sequence capable of being converted to the table structure).
In addition, NumPy-style point selections are supported. In particular, if key is a list of row coordinates, the set of rows determined by it is set to value. Furthermore, if key is an array of boolean values, only the coordinates where key is True are set to values from value. Note that for the latter to work, the key array must contain exactly as many values as there are rows in the table.
Examples
# Modify just one existing row
table[2] = [456,'db2',1.2]
# Modify two existing rows
rows = numpy.rec.array([[457,'db1',1.2],[6,'de2',1.3]],
formats='i4,a3,f8')
table[1:30:2] = rows # modify a table slice
table[[1,3]] = rows # only modifies rows 1 and 3
table[[True,False,True]] = rows # only modifies rows 0 and 2
Which is equivalent to:
table.modify_rows(start=2, rows=[456,'db2',1.2])
rows = numpy.rec.array([[457,'db1',1.2],[6,'de2',1.3]],
formats='i4,a3,f8')
table.modify_rows(start=1, stop=3, step=2, rows=rows)
table.modify_coordinates([1,3,2], rows)
table.modify_coordinates([True, False, True], rows)
Here, you can see how indexing can be used as a shorthand for the Table.modify_rows() and Table.modify_coordinates() methods.
Get the row coordinates fulfilling the given condition.
The coordinates are returned as a list of the current flavor. The sort argument indicates whether the coordinates should be returned in sorted order; the default is not to sort them.
The meaning of the other arguments is the same as in the Table.where() method.
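For instance (the pressure column is assumed for the example):

# Sorted coordinates of the rows where 'pressure' exceeds 10
coords = table.get_where_list('pressure > 10', sort=True)
rows = table.read_coordinates(coords)   # fetch those rows afterwards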
Read table data fulfilling the given condition.
This method is similar to Table.read(); their common arguments and return values have the same meanings. However, only the rows fulfilling the condition are included in the result.
The meaning of the other arguments is the same as in the Table.where() method.
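A minimal sketch, again assuming hypothetical pressure, lati and name columns:

# All rows fulfilling the condition, then only the 'name' field of them
recarray = table.read_where('(pressure > 10) & (lati < 0)')
names = table.read_where('pressure > 10', field='name')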
Iterate over values fulfilling a condition.
This method returns a Row iterator (see The Row class) which only selects rows in the table that satisfy the given condition (an expression-like string).
The condvars mapping may be used to define the variable names appearing in the condition. condvars should consist of identifier-like strings pointing to Column (see The Column class) instances of this table, or to other values (which will be converted to arrays). A default set of condition variables is provided where each top-level, non-nested column with an identifier-like name appears. Variables in condvars override the default ones.
When condvars is not provided or is None, the current local and global namespaces are consulted instead of condvars. This mechanism is mostly intended for interactive usage; to disable it, just specify a (possibly empty) mapping as condvars.
If a range is supplied (by setting some of the start, stop or step parameters), only the rows in that range and fulfilling the condition are used. The meaning of the start, stop and step parameters is the same as for Python slices.
When possible, indexed columns participating in the condition will be used to speed up the search. It is recommended that you place the indexed columns as far to the left and as far out in the condition as possible. In any case, this method always performs better than regular Python selections on the table.
You can mix this method with regular Python selections in order to support even more complex queries. It is strongly recommended that you pass the most restrictive condition as the parameter to this method if you want to achieve maximum performance.
Warning
When in the middle of a table row iterator, you should not use methods that can change the number of rows in the table (like Table.append() or Table.remove_rows()) or unexpected errors will happen.
Examples
>>> passvalues = [ row['col3'] for row in
... table.where('(col1 > 0) & (col2 <= 20)', step=5)
... if your_function(row['col2']) ]
>>> print("Values that pass the cuts:", passvalues)
Note that, from PyTables 1.1 on, you can nest several iterators over the same table. For example:
for p in rout.where('pressure < 16'):
    for q in rout.where('pressure < 9'):
        for n in rout.where('energy < 10'):
            print("pressure, energy:", p['pressure'], n['energy'])
In this example, iterators returned by Table.where() have been used, but you may as well use any of the other reading iterators that Table objects offer. See the file examples/nested-iter.py for the full code.
Changed in version 3.0: The start, stop and step parameters now behave like in slice.
Append rows fulfilling the condition to the dstTable table.
dstTable must be capable of taking the rows resulting from the query, i.e. it must have columns with the expected names and compatible types. The meaning of the other arguments is the same as in the Table.where() method.
The number of rows appended to dstTable is returned as a result.
Changed in version 3.0: The whereAppend method has been renamed to append_where.
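As a sketch, dst below stands for an already created table whose description is compatible with this one, and pressure is an assumed column:

# Append the matching rows to 'dst' and report how many were copied
nrows_copied = table.append_where(dst, 'pressure > 10')
dst.flush()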
Will a query for the condition use indexing?
The meaning of the condition and condvars arguments is the same as in the Table.where() method. If condition can use indexing, this method returns a frozenset with the path names of the columns whose index is usable. Otherwise, it returns an empty list.
This method is mainly intended for testing. Keep in mind that changing the set of indexed columns or their dirtiness may make this method return different values for the same arguments at different times.
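For example, assuming a column named pressure is indexed while lati is not:

# Returns a frozenset with the usable indexed column pathnames,
# e.g. a set containing 'pressure'; an empty result means no index will be used.
usable = table.will_query_use_indexing('(pressure > 10) & (lati < 0)')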
Copy this table and return the new one.
This method has the behavior and keywords described in Leaf.copy(). Moreover, it recognises the following additional keyword arguments.
Parameters
    sortby
    checkCSI
    propindexes
Add remaining rows in buffers to non-dirty indexes.
This can be useful when you have chosen non-automatic indexing for the table (see the Table.autoindex property in Table) and you want to update the indexes on it.
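A hedged sketch of that manual-indexing workflow (new_rows is a placeholder for your data, not a defined name):

table.autoindex = False       # keep indexes out of date on append
table.append(new_rows)        # 'new_rows' is a placeholder
table.flush_rows_to_index()   # now push the buffered rows into the indexes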
Get the enumerated type associated with the named column.
If the column named colname (a string) exists and is of an enumerated type, the corresponding Enum instance (see The Enum class) is returned. If it is not of an enumerated type, a TypeError is raised. If the column does not exist, a KeyError is raised.
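For instance, assuming the table was declared with a hypothetical enumerated column ball_color (e.g. EnumCol(Enum(['red', 'green', 'blue']), 'red', base='uint8')):

colors = table.get_enum('ball_color')
red_value = colors['red']               # concrete value stored for 'red'
name = colors(table[0]['ball_color'])   # translate a stored value back to its name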
Recompute all the existing indexes in the table.
This can be useful when you suspect that, for any reason, the index information for columns is no longer valid and want to rebuild the indexes on it.
Recompute the existing indexes in table, if they are dirty.
This can be useful when you have set Table.autoindex (see Table) to false for the table and you want to update the indexes after an invalidating index operation (Table.remove_rows(), for example).
This class represents descriptions of the structure of tables.
An instance of this class is automatically bound to Table (see The Table class) objects when they are created. It provides a browseable representation of the structure of the table, made of non-nested (Col - see The Col class and its descendants) and nested (Description) columns.
Column definitions under a description can be accessed as attributes of it (natural naming). For instance, if table.description is a Description instance with a column named col1 under it, the latter can be accessed as table.description.col1. If col1 is nested and contains a col2 column, this can be accessed as table.description.col1.col2. Because of natural naming, the names of members start with special prefixes, like in the Group class (see The Group class).
Description attributes
A dictionary mapping the names of the columns hanging directly from the associated table or nested column to their respective descriptions (Col - see The Col class and its descendants or Description - see The Description class instances).
Changed in version 3.0: The _v_colObjects attribute has been renamed to _v_colobjects.
A dictionary mapping the names of non-nested columns hanging directly from the associated table or nested column to their respective default values.
The NumPy type which reflects the structure of this table or nested column. You can use this as the dtype argument of NumPy array factories.
A dictionary mapping the names of non-nested columns hanging directly from the associated table or nested column to their respective NumPy types.
Whether the associated table or nested column contains further nested columns or not.
The size in bytes of an item in this table or nested column.
The name of this description group. The name of the root group is ‘/’.
A list of the names of the columns hanging directly from the associated table or nested column. The order of the names matches the order of their respective columns in the containing table.
A nested list of pairs of (name, format) tuples for all the columns under this table or nested column. You can use this as the dtype and descr arguments of NumPy array factories.
Changed in version 3.0: The _v_nestedDescr attribute has been renamed to _v_nested_descr.
A nested list of the NumPy string formats (and shapes) of all the columns under this table or nested column. You can use this as the formats argument of NumPy array factories.
Changed in version 3.0: The _v_nestedFormats attribute has been renamed to _v_nested_formats.
The level of the associated table or nested column in the nested datatype.
A nested list of the names of all the columns under this table or nested column. You can use this as the names argument of NumPy array factories.
Changed in version 3.0: The _v_nestedNames attribute has been renamed to _v_nested_names.
Pathname of the table or nested column.
A list of the pathnames of all the columns under this table or nested column (in preorder). If it does not contain nested columns, this is exactly the same as the Description._v_names attribute.
A dictionary mapping the names of non-nested columns hanging directly from the associated table or nested column to their respective PyTables types.
Iterate over nested columns.
If type is ‘All’ (the default), all column description objects (Col and Description instances) are yielded in top-to-bottom order (preorder).
If type is ‘Col’ or ‘Description’, only column descriptions of that type are yielded.
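For example:

# Walk the table structure in preorder and show every column description object
for coldescr in table.description._f_walk(type='All'):
    print(repr(coldescr))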
Table row iterator and field accessor.
Instances of this class are used to fetch and set the values of individual table fields. It works very much like a dictionary, where keys are the pathnames or positions (extended slicing is supported) of the fields in the associated table in a specific row.
This class provides an iterator interface so that you can use the same Row instance to access successive table rows one after the other. There are also some important methods that are useful for accessing, adding and modifying values in tables.
Row attributes
The current row number.
This property is useful for knowing which row is being dealt with in the middle of a loop or iterator.
Add a new row of data to the end of the dataset.
Once you have filled the proper fields for the current row, calling this method actually appends the new data to the output buffer (which will eventually be dumped to disk). If you have not set the value of a field, the default value of the column will be used.
Warning
After completion of the loop in which Row.append() has been called, it is always convenient to make a call to Table.flush() in order to avoid losing the last rows that may still remain in internal buffers.
Examples
row = table.row
for i in range(nrows):
    row['col1'] = i-1
    row['col2'] = 'a'
    row['col3'] = -1.0
    row.append()
table.flush()
Retrieve all the fields in the current row.
Unlike row[:] (see Row special methods), this method returns row data as a NumPy void scalar. For instance:
[row.fetch_all_fields() for row in table.where('col1 < 3')]
will select all the rows that fulfill the given condition as a list of NumPy records.
Change the data of the current row in the dataset.
This method allows you to modify values in a table when you are in the middle of a table iterator like Table.iterrows() or Table.where().
Once you have filled the proper fields for the current row, calling this method actually changes data in the output buffer (which will eventually be dumped to disk). If you have not set the value of a field, its original value will be used.
Warning
After completion of the loop in which Row.update() has been called, it is always convenient to make a call to Table.flush() in order to avoid losing changed rows that may still remain in internal buffers.
Examples
for row in table.iterrows(step=10):
    row['col1'] = row.nrow
    row['col2'] = 'b'
    row['col3'] = 0.0
    row.update()
table.flush()
which modifies every tenth row in the table. Or:
for row in table.where('col1 > 3'):
    row['col1'] = row.nrow
    row['col2'] = 'b'
    row['col3'] = 0.0
    row.update()
table.flush()
which just updates the rows with values bigger than 3 in the first column.
A true value is returned if item is found in the current row, false otherwise.
Get the row field specified by the key.
The key can be a string (the name of the field), an integer (the position of the field) or a slice (the range of field positions). When key is a slice, the returned value is a tuple containing the values of the specified fields.
Examples
res = [row['var3'] for row in table.where('var2 < 20')]
which selects the var3 field for all the rows that fulfil the condition. Or:
res = [row[4] for row in table if row[1] < 20]
which selects the field in the 4th position for all the rows that fulfil the condition. Or:
res = [row[:] for row in table if row['var2'] < 20]
which selects all the fields (in the form of a tuple) for all the rows that fulfil the condition. Or:
res = [row[1::2] for row in table.iterrows(2, 3000, 3)]
which selects all the fields in even positions (in the form of a tuple) for all the rows in the slice [2:3000:3].
Set the key row field to the specified value.
Differently from its __getitem__() counterpart, in this case key can only be a string (the name of the field). The changes done via __setitem__() will not take effect on the data on disk until any of the Row.append() or Row.update() methods are called.
Examples
for row in table.iterrows(step=10):
    row['col1'] = row.nrow
    row['col2'] = 'b'
    row['col3'] = 0.0
    row.update()
table.flush()
which modifies every tenth row in the table.
Container for columns in a table or nested column.
This class is used as an accessor to the columns in a table or nested column. It supports the natural naming convention, so that you can access the different columns as attributes which lead to Column instances (for non-nested columns) or other Cols instances (for nested columns).
For instance, if table.cols is a Cols instance with a column named col1 under it, the latter can be accessed as table.cols.col1. If col1 is nested and contains a col2 column, this can be accessed as table.cols.col1.col2 and so on. Because of natural naming, the names of members start with special prefixes, like in the Group class (see The Group class).
Like the Column class (see The Column class), Cols supports item access to read and write ranges of values in the table or nested column.
Cols attributes
A list of the names of the columns hanging directly from the associated table or nested column. The order of the names matches the order of their respective columns in the containing table.
A list of the pathnames of all the columns under the associated table or nested column (in preorder). If it does not contain nested columns, this is exactly the same as the Cols._v_colnames attribute.
The associated Description instance (see The Description class).
The parent Table instance (see The Table class).
Get an accessor to the column colname.
This method returns a Column instance (see The Column class) if the requested column is not nested, and a Cols instance (see The Cols class) if it is. You may use full column pathnames in colname.
Calling cols._f_col(‘col1/col2’) is equivalent to using cols.col1.col2. However, the first syntax is more intended for programmatic use. It is also better if you want to access columns with names that are not valid Python identifiers.
Get a row or a range of rows from a table or nested column.
If key argument is an integer, the corresponding nested type row is returned as a record of the current flavor. If key is a slice, the range of rows determined by it is returned as a structured array of the current flavor.
Examples
record = table.cols[4] # equivalent to table[4]
recarray = table.cols.Info[4:1000:2]
Those statements are equivalent to:
nrecord = table.read(start=4)[0]
nrecarray = table.read(start=4, stop=1000, step=2).field('Info')
Here you can see how a mix of natural naming, indexing and slicing can be used as shorthands for the Table.read() method.
Get the number of top level columns in table.
Set a row or a range of rows in a table or nested column.
If key argument is an integer, the corresponding row is set to value. If key is a slice, the range of rows determined by it is set to value.
Examples
table.cols[4] = record
table.cols.Info[4:1000:2] = recarray
Those statements are equivalent to:
table.modify_rows(4, rows=record)
table.modify_column(4, 1000, 2, colname='Info', column=recarray)
Here you can see how a mix of natural naming, indexing and slicing can be used as shorthands for the Table.modify_rows() and Table.modify_column() methods.
Accessor for a non-nested column in a table.
Each instance of this class is associated with one non-nested column of a table. These instances are mainly used to read and write data from the table columns using item access (like the Cols class - see The Cols class), but there are a few other associated methods to deal with indexes.
Column attributes
The Description (see The Description class) instance of the parent table or nested column.
The name of the associated column.
The complete pathname of the associated column (the same as Column.name if the column is not inside a nested column).
Parameters
    table
    name
    descr
The NumPy dtype that most closely matches this column.
The Index instance (see The Index class) associated with this column (None if the column is not indexed).
True if the column is indexed, false otherwise.
The dimension along which iterators work. Its value is 0 (i.e. the first dimension).
The shape of this column.
The parent Table instance (see The Table class).
The PyTables type of the column (a string).
Create an index for this column.
Warning
In some situations it is useful to get a completely sorted index (CSI). For those cases, it is best to use the Column.create_csindex() method instead.
Parameters
    optlevel : int
    kind : str
    filters : Filters
    tmp_dir
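A minimal sketch, assuming a non-nested pressure column (Filters is the standard PyTables filters class):

from tables import Filters
# Build a medium-quality index, compressing the index data lightly
table.cols.pressure.create_index(optlevel=6, kind='medium',
                                 filters=Filters(complevel=1, complib='zlib'))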
Create a completely sorted index (CSI) for this column.
This method guarantees the creation of an index with zero entropy, that is, a completely sorted index (CSI) – provided that the number of rows in the table does not exceed 2**48 (that is, more than 100 trillion rows). A CSI index is needed for some table methods (like Table.itersorted() or Table.read_sorted()) in order to ensure completely sorted results.
For the meaning of filters and tmp_dir arguments see Column.create_index().
Notes
This method is equivalent to Column.create_index(optlevel=9, kind=’full’, ...).
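For instance (again assuming a pressure column):

# A CSI index enables fully sorted results from itersorted() and read_sorted()
table.cols.pressure.create_csindex()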
Recompute the index associated with this column.
This can be useful when you suspect that, for any reason, the index information is no longer valid and you want to rebuild it.
This method does nothing if the column is not indexed.
Recompute the associated index only if it is dirty.
This can be useful when you have set Table.autoindex to false for the table and you want to update the column’s index after an invalidating index operation (like Table.remove_rows()).
This method does nothing if the column is not indexed.
Remove the index associated with this column.
This method does nothing if the column is not indexed. The removed index can be created again by calling the Column.create_index() method.
Get a row or a range of rows from a column.
If key argument is an integer, the corresponding element in the column is returned as an object of the current flavor. If key is a slice, the range of elements determined by it is returned as an array of the current flavor.
Examples
print("Column handlers:")
for name in table.colnames:
    print(table.cols._f_col(name))
print("Select table.cols.name[1]-->", table.cols.name[1])
print("Select table.cols.name[1:2]-->", table.cols.name[1:2])
print("Select table.cols.name[:]-->", table.cols.name[:])
print("Select table.cols._f_col('name')[:]-->",
      table.cols._f_col('name')[:])
The output of this for a certain arbitrary table is:
Column handlers:
/table.cols.name (Column(), string, idx=None)
/table.cols.lati (Column(), int32, idx=None)
/table.cols.longi (Column(), int32, idx=None)
/table.cols.vector (Column(2,), int32, idx=None)
/table.cols.matrix2D (Column(2, 2), float64, idx=None)
Select table.cols.name[1]--> Particle: 11
Select table.cols.name[1:2]--> ['Particle: 11']
Select table.cols.name[:]--> ['Particle: 10'
'Particle: 11' 'Particle: 12'
'Particle: 13' 'Particle: 14']
Select table.cols._f_col('name')[:]--> ['Particle: 10'
'Particle: 11' 'Particle: 12'
'Particle: 13' 'Particle: 14']
See the examples/table2.py file for a more complete example.
Get the number of elements in the column.
This matches the length in rows of the parent table.
Set a row or a range of rows in a column.
If key argument is an integer, the corresponding element is set to value. If key is a slice, the range of elements determined by it is set to value.
Examples
# Modify row 1
table.cols.col1[1] = -1
# Modify rows 1 and 3
table.cols.col1[1::2] = [2,3]
Which is equivalent to:
# Modify row 1
table.modify_columns(start=1, columns=[[-1]], names=['col1'])
# Modify rows 1 and 3
columns = numpy.rec.fromarrays([[2,3]], formats='i4')
table.modify_columns(start=1, step=2, columns=columns,
names=['col1'])