src.data.common

Created on Fri Apr 30 15:13:32 2021

@author: Paolo Cozzi <paolo.cozzi@ibba.cnr.it>

Common stuff for smarter scripts

class src.data.common.AssemblyConf(version, imported_from)

Bases: tuple

imported_from

Alias for field number 1

version

Alias for field number 0
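
Because AssemblyConf derives from tuple and exposes two field aliases, it behaves like a standard named tuple. The sketch below rebuilds an equivalent type with collections.namedtuple to show how the fields are accessed; the field values are hypothetical and do not come from the project.

```python
from collections import namedtuple

# Stand-in for src.data.common.AssemblyConf: a tuple with two named fields.
AssemblyConf = namedtuple("AssemblyConf", ["version", "imported_from"])

# Hypothetical values, for illustration only.
conf = AssemblyConf(version="ExampleAssembly_v1", imported_from="example source")

print(conf.version)        # same object as conf[0]
print(conf.imported_from)  # same object as conf[1]

# Being a tuple, it also unpacks positionally.
version, imported_from = conf
```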

src.data.common.deal_with_datasets(src_dataset: str, dst_dataset: str, datafile: str) → [Dataset, Dataset, Path]

Check the source and destination datasets and the data file content. Returns both Dataset objects and the Path of the data file.

src.data.common.deal_with_sex_and_alias(sex_column: str, alias_column: str, row: Series)

Read the sex and alias values from a row of the metadata file.

Parameters
  • sex_column (str) – The sex column label.

  • alias_column (str) – The alias column label.

  • row (Series) – A row of metadata file.

Returns

  • sex (SEX) – A SEX instance.

  • alias (str) – The alias read from the metadata table; it may be None.
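
A minimal sketch of how such a helper could work. The Sex enum, the accepted spellings, and the rule that a missing column label yields an unknown sex or a None alias are all assumptions made for illustration; the real SEX type lives in src.features.smarterdb and may behave differently. A plain dict stands in for the pandas Series, since both support row[label].

```python
from enum import Enum
from typing import Mapping, Optional, Tuple


class Sex(Enum):
    """Hypothetical stand-in for the SEX type, not the real class."""
    UNKNOWN = 0
    MALE = 1
    FEMALE = 2


# Assumed spellings; the real metadata may use other codes.
SEX_LABELS = {"m": Sex.MALE, "male": Sex.MALE, "f": Sex.FEMALE, "female": Sex.FEMALE}


def read_sex_and_alias(
    sex_column: Optional[str],
    alias_column: Optional[str],
    row: Mapping[str, object],
) -> Tuple[Sex, Optional[str]]:
    sex = Sex.UNKNOWN
    alias = None

    # Only look up a column when its label was provided.
    if sex_column:
        value = str(row[sex_column]).strip().lower()
        sex = SEX_LABELS.get(value, Sex.UNKNOWN)

    if alias_column:
        alias = str(row[alias_column])

    return sex, alias


row = {"sex": "F", "alias": "sample-1"}
print(read_sex_and_alias("sex", "alias", row))
```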

src.data.common.fetch_and_check_dataset(archive: str, contents: list[str]) → [Dataset, list[Path]]

Common operations on a dataset: fetch a dataset by file (the submitted archive), check that its working directory exists and that the required files are in the dataset, then resolve the full path of each required file.

Parameters
  • archive (str) – the dataset archive (file)

  • contents (list) – a list of files which need to be defined in the dataset

Returns

  • Dataset – a dataset instance

  • list[Path] – the paths of the required files

Return type

[Dataset, list[Path]]
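
The file checks described above can be sketched with pathlib alone. This is an illustration under assumptions, not the project's code: check_contents is a hypothetical helper, the dataset lookup is left out, and raising FileNotFoundError is an assumed failure mode.

```python
import tempfile
from pathlib import Path


def check_contents(working_dir: Path, contents: list[str]) -> list[Path]:
    """Return the full path of each required file, or fail if one is missing."""
    if not working_dir.is_dir():
        raise FileNotFoundError(f"working directory {working_dir} not found")

    paths = []
    for name in contents:
        path = working_dir / name
        if not path.exists():
            raise FileNotFoundError(f"{name} not found in {working_dir}")
        paths.append(path)

    return paths


# Usage with a throwaway directory and hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    (workdir / "genotypes.ped").touch()
    (workdir / "genotypes.map").touch()
    found = check_contents(workdir, ["genotypes.ped", "genotypes.map"])
    found_names = [p.name for p in found]
    print(found_names)
```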

src.data.common.get_sample_species(species: str) → Union[SampleSheep, SampleGoat]

Take a species name as input and return the proper SampleSpecies class.

Parameters

species (str) – the species name

Returns

a SampleSpecies class

Return type

Union[SampleSheep, SampleGoat]
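
A name-to-class lookup like this is often written as a dictionary dispatch. The sketch below uses empty stand-in classes; the accepted names ("Sheep", "Goat") and the ValueError raised for an unknown species are assumptions, not confirmed behaviour of get_sample_species.

```python
class SampleSheep:
    """Stand-in for src.features.smarterdb.SampleSheep."""


class SampleGoat:
    """Stand-in for src.features.smarterdb.SampleGoat."""


# Assumed species names; the real function may accept other spellings.
SAMPLE_CLASSES = {"Sheep": SampleSheep, "Goat": SampleGoat}


def pick_sample_class(species: str):
    """Return the class (not an instance) registered for a species name."""
    try:
        return SAMPLE_CLASSES[species]
    except KeyError:
        raise ValueError(f"Species '{species}' not managed") from None


SampleSpecies = pick_sample_class("Goat")
print(SampleSpecies.__name__)
```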

src.data.common.get_variant_species(species: str) → Union[VariantSheep, VariantGoat]

Take a species name as input and return the proper VariantSpecies class.

Parameters

species (str) – the species name

Returns

a VariantSpecies class

Return type

Union[VariantSheep, VariantGoat]

src.data.common.new_variant(variant: Union[VariantSheep, VariantGoat], location: Location)

src.data.common.pandas_open(datapath: Path, **kwargs) → DataFrame

Open an Excel or CSV file with pandas and return a DataFrame.

Parameters
  • datapath (Path) – the path of the file

  • kwargs (dict) – additional pandas options

Returns

file content as a pandas dataframe

Return type

pd.DataFrame
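
One way to implement this kind of helper is to choose the pandas reader from the file suffix. The sketch below is an assumption about the approach, not the project's code: pick_reader and open_table are hypothetical names, and treating every non-Excel suffix as CSV is a simplification. pandas is imported lazily so that the suffix logic can be tried without it installed.

```python
from pathlib import Path


def pick_reader(datapath: Path) -> str:
    """Return the name of the pandas reader suited to the file suffix."""
    if datapath.suffix.lower() in {".xls", ".xlsx"}:
        return "read_excel"
    # Assumption: anything else is treated as delimited text.
    return "read_csv"


def open_table(datapath: Path, **kwargs):
    """Open the file with the selected reader, forwarding extra pandas options."""
    import pandas as pd  # lazy import: only needed when a file is actually read

    reader = getattr(pd, pick_reader(datapath))
    return reader(datapath, **kwargs)


print(pick_reader(Path("metadata.xlsx")))
print(pick_reader(Path("metadata.csv")))
```

Extra keyword arguments pass straight through, so a call such as open_table(path, sep=";") would reach pandas.read_csv unchanged.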

src.data.common.update_affymetrix_record(variant: Union[VariantSheep, VariantGoat], record: Union[VariantSheep, VariantGoat]) → [Union[VariantSheep, VariantGoat], bool]

src.data.common.update_chip_name(variant: Union[VariantSheep, VariantGoat], record: Union[VariantSheep, VariantGoat]) → [Union[VariantSheep, VariantGoat], bool]

src.data.common.update_location(location: Location, variant: Union[VariantSheep, VariantGoat], force_update: bool = False) → [Union[VariantSheep, VariantGoat], bool]

Compare the provided Location with the variant's Locations: append it as a new object, or update an existing Location if the provided one is more recent than the data stored in the database.

Parameters
  • location (Location) – The location to test against the database.

  • variant (Union[VariantSheep, VariantGoat]) – The variant to test.

  • force_update (bool, optional) – Force location update. The default is False.

Returns

A list with the updated VariantSpecies object and a boolean value which is True if the location was updated.

Return type

[Union[VariantSheep, VariantGoat], bool]
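
The append-or-update rule can be sketched with plain dataclasses. Everything here is an assumption made for illustration: the Location and Variant stand-ins, matching stored locations on (version, imported_from), comparing an updated_on date to decide which is more recent, and reporting an append as an update. The real classes are database documents and may differ on each point.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Location:
    """Hypothetical stand-in for the database Location object."""
    version: str
    imported_from: str
    position: int
    updated_on: Optional[date] = None


@dataclass
class Variant:
    """Hypothetical stand-in for VariantSheep / VariantGoat."""
    name: str
    locations: List[Location] = field(default_factory=list)


def update_location_sketch(location: Location, variant: Variant, force_update: bool = False):
    key = (location.version, location.imported_from)

    for index, stored in enumerate(variant.locations):
        if (stored.version, stored.imported_from) != key:
            continue

        # Same assembly and source: replace only if newer, or if forced.
        newer = location.updated_on is not None and (
            stored.updated_on is None or location.updated_on > stored.updated_on
        )
        if force_update or newer:
            variant.locations[index] = location
            return [variant, True]
        return [variant, False]

    # No matching location stored yet: append the new one.
    variant.locations.append(location)
    return [variant, True]


snp = Variant(name="example_snp")
first = Location("ExampleAssembly_v1", "example source", 100, date(2020, 1, 1))
print(update_location_sketch(first, snp)[1])
```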

src.data.common.update_probesets(variant_attr: list[Probeset], record_attr: list[Probeset]) → bool

Update probesets, relying on object references.

src.data.common.update_rs_id(variant: Union[VariantSheep, VariantGoat], record: Union[VariantSheep, VariantGoat]) → [Union[VariantSheep, VariantGoat], bool]

src.data.common.update_sequence(variant: Union[VariantSheep, VariantGoat], record: Union[VariantSheep, VariantGoat]) → [Union[VariantSheep, VariantGoat], bool]

src.data.common.update_variant(qs: QuerySet, variant: Union[VariantSheep, VariantGoat], location: Location) → bool

Update an existing variant (if necessary)