fbi_core module
- fbi_core.all_under_query(path, location=None, name_regex=None, include_removed=False, item_type=None, ext=None, since=None, before=None, audited_since=None, audited_before=None, corrupt_since=None, corrupt_before=None, with_field=None, without=None, blank=None, maxsize=None, minsize=None, fileset=None, after=None, stop=None)
Build an Elasticsearch query for FBI records.
- Parameters:
path (str) – The path to search under.
location (str) – Media location, either on_disk or on_tape. Default is all locations.
name_regex (str) – A regular expression to match against the file or directory name.
include_removed (bool) – Flag to include removed items in the search.
item_type (str) – Item type for the record. Either “file”, “dir” or “link”.
ext (str) – Search on file extension, e.g. “.nc”.
since (str) – Search for items modified since this ISO-formatted datetime.
before (str) – Search for items modified before this ISO-formatted datetime.
audited_since (str) – Search for items audited since this ISO-formatted datetime.
audited_before (str) – Search for items audited before this ISO-formatted datetime.
corrupt_since (str) – Search for items corrupt since this ISO-formatted datetime.
corrupt_before (str) – Search for items corrupt before this ISO-formatted datetime.
with_field (str) – Search for items where this field exists.
without (str) – Search for items where this field does not exist.
blank (str) – Search for items where this field is an empty string.
maxsize (int) – Search for items smaller than this size in bytes.
minsize (int) – Search for items larger than this size in bytes.
fileset (str) – Search for items in a fileset.
after (str) – Search items where path is lexically after this.
stop (str) – Search items where path is lexically before this.
- Return dict:
Elasticsearch query which can be passed to the Elasticsearch client.
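To make the shape of such a query concrete, the sketch below builds a bool query for a few of the filters listed above. It is a hypothetical stand-in, not the implementation of all_under_query: the function name sketch_under_query, the example path and the index field names (path, ext, last_modified, size, removed) are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def sketch_under_query(path, ext=None, since=None, minsize=None,
                       include_removed=False):
    """Hypothetical stand-in, NOT fbi_core.all_under_query.

    Field names below are assumed; the real index may use different ones.
    """
    # Match everything whose path starts with the directory prefix.
    must = [{"prefix": {"path": path.rstrip("/") + "/"}}]
    must_not = []
    if ext is not None:
        must.append({"term": {"ext": ext}})
    if since is not None:
        # ISO-formatted datetimes can be used directly in range clauses.
        must.append({"range": {"last_modified": {"gte": since}}})
    if minsize is not None:
        must.append({"range": {"size": {"gt": minsize}}})
    if not include_removed:
        # Assumes removed items carry a "removed" date field.
        must_not.append({"exists": {"field": "removed"}})
    return {"query": {"bool": {"must": must, "must_not": must_not}}}


# Build an ISO-formatted datetime string for the "since" filter.
since = datetime(2023, 1, 1, tzinfo=timezone.utc).isoformat()
query = sketch_under_query("/archive/example", ext=".nc",
                           since=since, minsize=1024)
```

With the real module, the equivalent call would presumably pass the same keyword arguments (path, ext, since, minsize) to fbi_core.all_under_query and hand the returned dictionary to an Elasticsearch client.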
- fbi_core.archive_summary(path, max_types=5, max_vars=1000, max_exts=10, include_removed=False, **kwargs)
Find summary information for the archive below a path.
- fbi_core.bulk_update(records)
Update a list of records.
- fbi_core.fbi_listdir(directory, fetch_size=10000, dirs_only=False, removed=False, hidden=True)
List the FBI records for a directory.
- fbi_core.fbi_records(after='/', stop='~', fetch_size=10000, exclude_phenomena=False, item_type=None, **kwargs)
FBI record iterator. Iteration is implicitly in path order.
- Parameters:
after (str) – Paths after this are iterated over. Defaults to “/”.
stop (str) – Iteration stops when the path is greater than or equal to this. Defaults to “~”.
fetch_size (int) – The number of records to request from Elasticsearch at a time.
exclude_phenomena (bool) – Remove the bulky phenomena attribute from the record. Default is False.
item_type (str) – Item type for the records. Either “file”, “dir” or “link”. Defaults to all types.
- Return iterator[dict]:
Yields FBI records as dictionaries.
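The after/stop window and the batched fetching can be illustrated without an index. The sketch below mimics the documented semantics on an in-memory list of paths: it yields paths strictly after `after`, stops at the first path greater than or equal to `stop`, and pulls fetch_size items at a time. The helper iter_window and the sample paths are hypothetical; the real function queries Elasticsearch and yields record dictionaries rather than bare paths.

```python
def iter_window(sorted_paths, after="/", stop="~", fetch_size=2):
    """Yield paths with after < path < stop, fetched in batches.

    Illustrative only: a list comprehension stands in for an
    Elasticsearch request returning at most fetch_size hits.
    """
    position = after
    while True:
        batch = [p for p in sorted_paths if position < p < stop][:fetch_size]
        if not batch:
            return
        yield from batch
        # Resume after the last path seen, as a paged query would.
        position = batch[-1]


paths = sorted(["/a/x.nc", "/a/y.nc", "/b/z.nc", "/c"])

# The defaults cover every absolute path: "/" sorts before any longer
# path starting with "/", and "~" (ASCII 126) sorts after "/" (ASCII 47).
everything = list(iter_window(paths))

# Restricting the window selects only the paths between the two bounds.
under_a = list(iter_window(paths, after="/a", stop="/b"))
```

Because comparison is purely lexical, a stop value such as “/b” excludes “/b/z.nc”, which sorts after it.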
- fbi_core.fbi_records_under(path='/', fetch_size=10000, exclude_phenomena=False, **kwargs)
FBI record iterator in path order.
- fbi_core.flag_removed(record)
Mark a file as removed by adding a removed date.
- fbi_core.get_records_by_content(md5, filename=None, under=None, include_removed=False)
Get records with content that matches an md5. Optionally require a matching filename and parent directory.
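A checksum for this lookup can be computed with the standard library. The sketch below hashes data in chunks, as one would for a large file; it assumes the md5 argument is a hexadecimal digest string, which the documentation above does not state explicitly. The helper name md5_hexdigest and the filename in the commented call are hypothetical.

```python
import hashlib


def md5_hexdigest(chunks):
    """Return the hex MD5 digest of an iterable of byte chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


# Chunked hashing gives the same digest as hashing the data in one go.
checksum = md5_hexdigest([b"hel", b"lo"])

# A lookup might then look like this (not executed here, since it needs
# fbi_core and access to its index; "example.nc" is a made-up filename):
# records = fbi_core.get_records_by_content(checksum, filename="example.nc")
```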
- fbi_core.insert_item(record)
Insert a record, replacing any existing one.
- fbi_core.lastest_file(directory)
Return the record of the most recently updated file under a path.
- Parameters:
directory (str) – path to search for last updated file
- Return dict or None:
Record for the last updated file.
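Since the return value may be None, callers need to handle the case where no file exists under the directory. The sketch below shows the same contract on a plain list of record dictionaries. It is not the library's implementation, and the record keys type and last_modified, like the sample paths, are assumptions about the record layout.

```python
def latest_record(records):
    """Return the most recently modified file record, or None.

    Assumes each record has "type" and an ISO-formatted "last_modified";
    ISO strings in one consistent format compare correctly as text.
    """
    files = [r for r in records if r.get("type") == "file"]
    return max(files, key=lambda r: r["last_modified"], default=None)


records = [
    {"path": "/archive/example/a.nc", "type": "file",
     "last_modified": "2023-01-01T00:00:00"},
    {"path": "/archive/example/sub", "type": "dir",
     "last_modified": "2024-01-01T00:00:00"},
    {"path": "/archive/example/b.nc", "type": "file",
     "last_modified": "2023-06-01T12:00:00"},
]

latest = latest_record(records)
if latest is None:
    print("no files found")
else:
    print(latest["path"])
```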
- fbi_core.ls_query(path, size=10000, **kwargs)
ls for FBI.
- Parameters:
path (str) – The path to list.
**kwargs –
Any options from all_under_query.
- Return list[dict]:
FBI records.
- fbi_core.make_dirs(directory)
Make FBI records for a directory and any missing parent directories.
- Parameters:
directory (str) – The directory to add.
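The set of directories involved can be worked out from the path alone. The sketch below lists a directory together with its ancestors, parent first, which is the chain of records that would need to exist. It is illustrative only: directory_chain is a hypothetical helper, and leaving out the root “/” is an assumption rather than documented behaviour.

```python
from pathlib import PurePosixPath


def directory_chain(directory):
    """Return the directory and its ancestors, shallowest first.

    The root "/" is left out here; whether the real function creates a
    record for it is not stated in the documentation.
    """
    path = PurePosixPath(directory)
    ancestors = [p for p in list(path.parents)[::-1] if str(p) != "/"]
    return [str(p) for p in ancestors + [path]]


chain = directory_chain("/archive/example/data")
```

Each entry that does not already have a record would be a candidate for creation, in this order, so that a parent always exists before its child.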
- fbi_core.nla_dirs(after='/', stop='/~', fetch_size=10000)
FBI record iterator for NLA directories.
- fbi_core.update_file_location(path_list, location)
Mark a list of paths as being on a given media type. This is for the NLA system to record media movements.
- Parameters:
path_list (list) – A list of paths to mark up.
location (str) – “on_disk”, “on_tape” or “on_obstore”.
- fbi_core.update_item(record)
Update a single document, overwriting the fields in the record supplied.
- fbi_core.where_is(name, fetch_size=10000, removed=False)
Return records for items with the given name.