fbi_core module

fbi_core.all_under_query(path, location=None, name_regex=None, include_removed=False, item_type=None, ext=None, since=None, before=None, audited_since=None, audited_before=None, corrupt_since=None, corrupt_before=None, with_field=None, without=None, blank=None, maxsize=None, minsize=None, fileset=None, after=None, stop=None)

Build an Elasticsearch query for FBI records.

Parameters:
  • path (str) – The path to search under.

  • location (str) – Media location, either on_disk or on_tape. Default is all locations.

  • name_regex (str) – A regular expression to match against the file or directory name.

  • include_removed (bool) – Flag to include removed items in the search.

  • item_type (str) – Item type for the record. Either “file”, “dir” or “link”.

  • ext (str) – Search on file extension, e.g. “.nc”.

  • since (str) – Search for items modified since this ISO-formatted datetime.

  • before (str) – Search for items modified before this ISO-formatted datetime.

  • audited_since (str) – Search for items audited since this ISO-formatted datetime.

  • audited_before (str) – Search for items audited before this ISO-formatted datetime.

  • corrupt_since (str) – Search for items corrupt since this ISO-formatted datetime.

  • corrupt_before (str) – Search for items corrupt before this ISO-formatted datetime.

  • with_field (str) – Search for items where this field exists.

  • without (str) – Search for items where this field does not exist.

  • blank (str) – Search for items where this field is an empty string.

  • maxsize (int) – Search for items smaller than this size in bytes.

  • minsize (int) – Search for items larger than this size in bytes.

  • fileset (str) – Search for items in a fileset.

  • after (str) – Search items where path is lexically after this.

  • stop (str) – Search items where path is lexically before this.

Return dict:

An Elasticsearch query that can be passed to the Elasticsearch client.
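
To illustrate how a few of these filters could translate into the Elasticsearch query DSL, here is a minimal sketch. It is not the fbi_core implementation: the helper name sketch_query, the index field names (path, size, last_modified) and the example path /badc/ are assumptions made for illustration. Only the bool/range query structure is standard Elasticsearch.

```python
from datetime import datetime


def sketch_query(after=None, stop=None, minsize=None, maxsize=None, since=None):
    """Hypothetical helper: combine optional filters into a bool query.

    The field names ("path", "size", "last_modified") are assumed for
    illustration and are not taken from fbi_core.
    """
    must = []

    # Lexical path window: strictly after `after`, strictly before `stop`.
    path_range = {}
    if after is not None:
        path_range["gt"] = after
    if stop is not None:
        path_range["lt"] = stop
    if path_range:
        must.append({"range": {"path": path_range}})

    # Size window in bytes.
    size_range = {}
    if minsize is not None:
        size_range["gt"] = minsize
    if maxsize is not None:
        size_range["lt"] = maxsize
    if size_range:
        must.append({"range": {"size": size_range}})

    # Modification time lower bound, given as an ISO-formatted string.
    if since is not None:
        must.append({"range": {"last_modified": {"gte": since}}})

    return {"query": {"bool": {"must": must}}}


# datetime.isoformat() produces the ISO-formatted strings the parameters expect.
since = datetime(2024, 1, 1).isoformat()
query = sketch_query(after="/badc/", stop="/badc/~", maxsize=1_000_000, since=since)
```

The real function accepts many more filters. The sketch only shows the general pattern of appending one clause per supplied argument.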

fbi_core.archive_summary(path, max_types=5, max_vars=1000, max_exts=10, include_removed=False, **kwargs)

Find summary information for the archive below a path.

fbi_core.bulk_update(records)

Update a list of records.

fbi_core.fbi_listdir(directory, fetch_size=10000, dirs_only=False, removed=False, hidden=True)

FBI record list for a directory

fbi_core.fbi_records(after='/', stop='~', fetch_size=10000, exclude_phenomena=False, item_type=None, **kwargs)

FBI record iterator. Iteration is implicitly in path order.

Parameters:
  • after (str) – Paths after this are iterated over. Defaults to “/”.

  • stop (str) – Iteration stops when the path is greater than or equal to this. Defaults to “~”.

  • fetch_size (int) – The number of records to request from elasticsearch at a time.

  • exclude_phenomena (bool) – remove the bulky phenomena attribute from the record. Default is False.

  • item_type (str) – Item type for the records. Either “file”, “dir” or “link”. Defaults to all types.

Return iterator[dict]:

Yields FBI records as dictionaries.
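
The default bounds work because “/” sorts before, and “~” sorts after, every other printable ASCII character that can start or continue an absolute path. The following minimal sketch models the window with plain Python string comparison. The helper name in_iteration_range and the example paths are hypothetical, and the ordering used by the Elasticsearch index may differ for non-ASCII paths.

```python
def in_iteration_range(path, after="/", stop="~"):
    """Hypothetical model of the window: after < path < stop, compared lexically."""
    return after < path < stop


# With the defaults, any ASCII absolute path falls inside the window.
everything = in_iteration_range("/badc/file.nc")

# Narrowing the window to one subtree: "/badc/" and "/badc/~" bracket
# every ASCII path below /badc/ but exclude sibling trees.
inside = in_iteration_range("/badc/cmip6/a.nc", after="/badc/", stop="/badc/~")
sibling = in_iteration_range("/badcx/a.nc", after="/badc/", stop="/badc/~")
other = in_iteration_range("/neodc/a.nc", after="/badc/", stop="/badc/~")
```

Here everything and inside are True, while sibling and other are False.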

fbi_core.fbi_records_under(path='/', fetch_size=10000, exclude_phenomena=False, **kwargs)

FBI record iterator in path order

fbi_core.flag_removed(record)

Mark a file as removed by adding a removed date.

fbi_core.get_records_by_content(md5, filename=None, under=None, include_removed=False)

Get records whose content matches an md5. Optionally restrict the match to a filename and a parent directory.
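
A checksum to pass as md5 can be computed with the standard library. This is a minimal sketch that assumes the index stores the checksum as a lowercase hexadecimal digest, which the text above does not confirm. The helper name md5_of_file is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use bounded for large archive files.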

fbi_core.insert_item(record)

Insert a record, replacing any existing record.

fbi_core.lastest_file(directory)

Return the record of the most recently updated file under a path.

Parameters:

directory (str) – path to search for last updated file

Return dict or None:

Record for the last updated file.

fbi_core.ls_query(path, size=10000, **kwargs)

List FBI records under a path, in the manner of ls.

Parameters:
  • path (str) –

  • **kwargs

    Any options from all_under_query

Return list[dict]:

FBI records.

fbi_core.make_dirs(directory)

Make FBI records for a directory and any missing parent directories.

Parameters:

directory (str) – The directory to add.
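
The set of paths that may need records is the directory itself plus each of its ancestors. The sketch below only enumerates that chain. It does not query or write to the index, the helper name directory_chain and the example path are hypothetical, and whether fbi_core creates a record for the root is an assumption.

```python
from pathlib import PurePosixPath


def directory_chain(directory):
    """List a directory's ancestors and the directory itself, shallowest first."""
    path = PurePosixPath(directory)
    # path.parents runs from the nearest parent up to the root, so reverse it.
    chain = [str(parent) for parent in list(path.parents)[::-1]]
    chain.append(str(path))
    return chain


chain = directory_chain("/badc/cmip6/data")
```

For the example path, chain holds "/", "/badc", "/badc/cmip6" and "/badc/cmip6/data", in that order.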

fbi_core.nla_dirs(after='/', stop='/~', fetch_size=10000)

FBI record iterator for nla directories

fbi_core.update_file_location(path_list, location)

Mark list of paths as on media type. This is for the NLA system to update media movements.

Parameters:
  • path_list (list) – A list of paths to mark up.

  • location (str) – “on_disk”, “on_tape” or “on_obstore”.

fbi_core.update_item(record)

Update a single document, overwriting fields with those in the supplied record.

fbi_core.where_is(name, fetch_size=10000, removed=False)

Return records for items with the given name.