numpy.fromregex (NumPy v1.15 Manual)

numpy.fromregex(file, regexp, dtype, encoding=None)

Construct an array from a text file, using regular expression parsing.

The returned array is always a structured array, and is constructed from all matches of the regular expression in the file. Groups in the regular expression are converted to fields of the structured array.
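To make the group-to-field mapping concrete, the following minimal sketch uses only the standard `re` module on an in-memory string. The sample text and the names `text`, `pattern` and `records` are illustrative assumptions, and the sketch shows the matching behavior rather than NumPy's own code. Each match yields one tuple and each group contributes one element of that tuple, which is the shape a structured array expects: one record per match, one field per group.

```python
import re

# Illustrative sample text (an assumption, not read from a real file).
text = "1312 foo\n1534 bar\n444 qux"

# Two groups: a run of digits, then any three characters after whitespace.
pattern = re.compile(r"(\d+)\s+(...)")

# findall returns one tuple per match, with one string per group.
records = pattern.findall(text)

for num, key in records:
    print(num, key)
```

The elements are still strings at this point; converting them to the field types is the job of the dtype passed to `fromregex`.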

Parameters:
    file : str or file
        File name or file object to read.
    regexp : str or regexp
        Regular expression used to parse the file. Groups in the regular expression correspond to fields in the dtype.
    dtype : dtype or list of dtypes
        Dtype for the structured array.
    encoding : str, optional
        Encoding used to decode the input file. Does not apply to input streams.
        New in version 1.14.0.

Returns:
    output : ndarray
        The output array, containing the part of the content of file that was matched by regexp. output is always a structured array.

Raises:
    TypeError
        When dtype is not a valid dtype for a structured array.
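The parameters above allow two calling styles, sketched below under a few assumptions: Python 3, a NumPy version that supports `encoding` (1.14.0 or later), and a temporary file whose name and contents are made up for illustration. The first call passes a file object and a precompiled pattern; the second passes a file name, a pattern string and an explicit `encoding`.

```python
import io
import os
import re
import tempfile

import numpy as np

# Illustrative sample data and dtype (assumptions for this sketch).
text = "1312 foo\n1534 bar\n444 qux"
dtype = [("num", np.int64), ("key", "U3")]

# 1) File object plus a compiled regular expression.
#    `encoding` is not used here because the stream is already text.
pattern = re.compile(r"(\d+)\s+(...)")
from_stream = np.fromregex(io.StringIO(text), pattern, dtype)

# 2) File name plus a pattern string, decoding the file as UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.dat")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    from_name = np.fromregex(path, r"(\d+)\s+(...)", dtype, encoding="utf-8")

print(from_stream["num"])
print(from_name["key"])
```

Both calls return a structured array with the fields `num` and `key`, so individual columns can be selected by field name.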

Notes

Dtypes for structured arrays can be specified in several forms, but all forms specify at least the data type and field name. For details see doc.structured_arrays.
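As a brief illustration of the "several forms" mentioned above, the sketch below builds the same two-field layout in three ways. The field names `num` and `key` are borrowed from the example on this page; the variable names are assumptions for illustration. In the comma-separated string form the field names are not written out, so NumPy assigns the defaults `f0`, `f1`, and so on.

```python
import numpy as np

# List of (field name, data type) tuples.
dt_list = np.dtype([("num", np.int64), ("key", "S3")])

# Dictionary with parallel 'names' and 'formats' lists.
dt_dict = np.dtype({"names": ["num", "key"], "formats": [np.int64, "S3"]})

# Comma-separated string: field names default to 'f0', 'f1', ...
dt_str = np.dtype("i8, S3")

print(dt_list == dt_dict)  # both describe the same fields
print(dt_list.names)
print(dt_str.names)
```

Any of these dtype objects can be passed as the `dtype` argument, provided the number of fields matches the number of groups in the regular expression.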

Examples

>>> import numpy as np
>>> f = open('test.dat', 'w')
>>> f.write("1312 foo\n1534 bar\n444 qux")
>>> f.close()

>>> regexp = r"(\d+)\s+(...)"  # match [digits, whitespace, anything]
>>> output = np.fromregex('test.dat', regexp,
...                       [('num', np.int64), ('key', 'S3')])
>>> output
array([(1312L, 'foo'), (1534L, 'bar'), (444L, 'qux')],
      dtype=[('num', '<i8'), ('key', '|S3')])
>>> output['num']
array([1312, 1534,  444], dtype=int64)