History

TODO

Check chromosomes in Variants locations: mind to scaffold, null, and non-autosomal chromosomes for Goat and Sheep
Rename objects (use names in a consistent way, ex TOP, BOT)
Release a smarter coordinate version with information on every variant defined in database (which will be used as reference)
Map affymetrix snps in OARV3 coordinates
Check if rs_id is still valid or not (with EVA)
Manage python packages with poetry
Rename manifacturer into manufacturer

0.4.9 (2023-09-27)

Load phenotypes for Fosses, Provencale goat breeds
Add sex for Fosses, Provencale goat breeds
Add sex while importing metadata
Load multiple phenotypes for Boutsko foreground sheeps
Add multiple phenotypes as a list (103)
Update datasets metadata
Update dependencies

0.4.8 (2023-06-28)

Capitalize species_class parameter in src.data.import_breeds.py
Generate output files for OARV4 and CHIR1 (#87)
Import data from dbSNP152 (#15)
Import data from IGGC (#18)
Split import_consortium.py in import_isgc.py and import_iggc.py to import data from Sheep and Goat genome consortia respectively
Force data update when importing from consortium
Track date when importing from consortium
Determine illumina_top data directly from variant for Sheep when importing from consortium data
Uniform note metadata field (add a note parameters in import metadata)
Import data from Cortellari et al 2021 (https://doi.org/10.1038/s41598-021-89900-2)
Import data from Burren et al 2016 (https://doi.org/10.1111/age.12476)
Revise illumina A/B genotype tracking
Import from Illumina report with only 3 columns in SNP list file
Update dependencies

0.4.7 (2022-12-23)

Import background data from Gaouar et al 2017 (https://doi.org/10.1038/hdy.2016.86)
Import from plink with illumina coding (as specified in manifest: not top nor forward)
Import background data from Belabdi et al 2019 (https://doi.org/10.1038/s41598-019-44137-y)
Import background data from Ciani et al 2020 (https://doi.org/10.1186/s12711-020-00545-7)
Import background data from Barbato et al 2017 (https://doi.org/10.1038/s41598-017-07382-7)
Update species for european mouflon
Support species update with import_metadata.py
Import 18 welsh breed as background genotypes
Rename two welsh breeds
Model doi in datasets
Upgrade CI workflows to actions/cache@v3
Add SNPconvert.py script
Import genotypes of other WPs coming from Uruguay
Deal with affymetrix report with less SNPs than declared
Add an option to skip coordinate check when importing affymetrix report
Import from affymetrix a limited number of samples
Skip sample creation when there’s no alias
Support for missing columns in affymetrix report files
Support invalid python names in src.features.affymetrix.read_affymetrixRow
Update requirements
Deal with missing files in import_datasets.py
Update Uruguay metadata locations
Move Galway sheep to Ireland country (Ovine HapMap)

0.4.6 (2022-09-26)

Update requirements
Read from affymetrix A/B reportfile
Import latest Uruguayan data (#65)
Configure database connection (#66)
Update sex in ped file if there are information in database
Enable continuous integration for documentation (ReadTheDocs)
Update documentation
Track full species information in Sample (support for multi-species sheep and goats)
Updated isheep exploration notebooks
Deal with unknown countries and species
Fix issues related on alias when creating samples or adding metadata
Fetch variants using positions
Import from plink using genomic coordinates
Import 50K, 600K and WGS isheep datasets (#47)
Fix issue in src.features.plinkio.plink_binary_exists
Code refactoring in src.features.plinkio
Import data from Sheep HapMap V2

0.4.5 (2022-06-14)

Update requirements
Import data from Hungary (#53)
Create a new sample when having the same original_id in dataset but for a different breed
illumina_top is an attribute of variant, and is set when the first location is loaded.
Check variants data before update (#56)
Simplified import_affymetrix script
Import custom affymetrix chips (Oar_v3.1)
Support source and destination assemblies when importing from plink or affymetrix source files
Deal with spaces in filenames while importing from plink
Add affy_snp_id primary key
Update import_affymetrix.py script
Import data from Spain (#52)
Fix 20220503 dataset breed and churra chip name
Track manifest probe sequence``s by ``chip_name
Track probeset_id by chip_name
Search for affymetrix probeset_id in the proper chip_name while importing samples
Track multiple rs_id
Fetch churra coordinates by rs_id and probeset_id and filter out unmanaged SNPs
If src_dataset and dst_dataset are equals, provide only src_dataset

0.4.4 (2022-02-28)

Model location with MultiPointField
Describe smarter metadata
Import sweden goat metadata
Import latest 290 samples greek dataset
Fix issue with greek samples name (B273 converted into B273A)
Add latest 19 sheep greek samples
Add a country collection
Update dependencies

0.4.3 (2021-11-11)

Add 270 Frizarta background samples
Import from ab plink and support multiple missing letters
Track database status and constants
Add foreground/background type attribute in SampleSpecies
Update dependencies
Add make rule to pack results and make checksum
Move greek foreground metadata to a custom phenotypes dataset
Update greek foreground metadata
Import phenotypes from Uruguay
Import phenotypes using alias
Allow phenotypes for ambiguous sex animals
Import french goat foreground dataset
Pin plinkio to support extra-chroms in plink binary files
Import 5 Sweden Sheep background genotypes
Force half-missing SNPs to be MISSING
Add the README.txt.ftp
Bug fixed in importing multibreed reportfile (setting FID properly in output)

0.4.2 (2021-08-27)

Set nullable ListField for sample locations and variant consequences
Capitalize phenotype values (ie milk -> Milk)
Import greek chios-mytilini-boutsko sheep dataset
Track multiple location for sample (deal with transhumant breeds )
Import greek skopelios-eghoria goat dataset
Use sample data to deal with multi breeds illumina row files
Determine fid from database with IlluminaReportIO
Import greek frizarta-chios-pelagonia sheep dataset
Import greek frizarta-chios sheep dataset
Import sweden foreground goat dataset
Update ADAPTmap breed names and phenotypes import
Check that breed exists while inserting phenotype data
Import french foreground sheep dataset
Use elemMatch in projection in plinkio.SmarterMixin.fetch_coordinates (ex: VariantSheep.objects.fields(elemMatch__locations={"imported_from": "SNPchiMp v.3", "version": "Oar_v4.0"}))
Use elemMatch to search a SNP within the desired coordinate systems in plinkio.SmarterMixin.fetch_coordinates
Skip SNPchimp indels when importing from SNPchimp
Skip illumina indels when reading from manifest

0.4.1 (2021-09-08)

Add chip_name in Dataset (database value, not user value)
Skip null fields when importing datasets
Import uruguay sheep affymetrix data
Import from affymetrix dataset
Rely on original affymetrix coordinate system to determine illumina top alleles
Search samples aliases while importing genotypes
Clearly state when creating samples (ignore samples if not defined in database)
Track sample aliases for original_id
Import samples from file by providing country and breeds values as parameters
Import sheep coordinates from genome project
Security updates
Fix github Workflow

0.4.0 (2021-06-18)

dbSNP feature library refactor
fix linter issues
Transform affymetrix unmapped chrom to 0
Transform SNPchiMp unmapped chroms to 0
ignore affymetrix insertions and deletions
join affymetrix data with illumina relying on cust_id
define illumina_top from affymetrix flanking sequences
load data from affymetrix manifest
calculate illumina_top from affymetrix sequence
Test import data from snpchimp
Import OARV4 coordinates
data/common module refactoring
Fix bug in importing dataset order
Model affymetrix fields
Read from affymetrix manifest file
Track illumina manufactured date

0.3.1 (2021-06-11)

Upgrade dependencies
Enable continuous integration
- Github Workflow
- Coverage

0.3.0 (2021-05-19)

Deal with multi-sheets .xlsx documents
Import phenotypes (from a source dataset to a destination dataset)
Define phenotype attribute as a mongoengine.DynamicDocument field
Import metadata or phenotype by breeds or by samples
Import metadata (from a source dataset to a destination dataset)
Forcing plink chrom options when converting in binary formats
import data from ADAPTmap project
- Import goat breeds (from a source dataset to a destination dataset)
- Import goat data from plink files
- Import goat metadata
Import goat data from manifest and snpchimp
configure mongodb-express credentials
Add Goat Related tables
- add variantGoat collection
- add sampleGoat collection

0.2.3 (2021-05-03)

Unset ped columns if relationship can’t be derived from data (ex. brazilian BSI)
Deal with geographical coordinates
Add features to samples (relying on metadata file)

0.2.2 (2021-04-29)

Breed name should be a unique key within species
make rule to clean-up interim data
skip already processed file from import
Deal with mother_id and father_id (search for smarter_id in database)
Deal with multi-countries dataset
- track country in aliases while importing breeds from dataset

0.2.1 (2021-04-22)

Track chip_name with samples
Deal with binary plink files
Search breed by aliases used in dataset:
- match fid with breed aliases in dataset
- store aliases by dataset
Add breeds from .xlsx files

0.2.0 (2021-04-15)

Merge multiple files per dataset
Import from an illumina report file
Deal with AB allele coding
Deal with plink text files using modules
Fix SNPchiMp data import
Determine illumina_top coding as a property relying on database data
Support multi-manifest upload (extend database with HD chip)
Deal with compressed manifest
Add breeds with CLI
Check coordinates format relying on DRM
Test stuff with mongomock

0.1.0 (2021-03-29)

Start with project documentation
Explore background datasets
Merge plink binary files
Convert from forward to illumina_top coordinates
Convert to plink binary format
Manage database credentials
Import samples into smarter database while fixing coordinates and genotypes
Configure tox and sphinx environments
Model breeds in smarter database
Import datasets into database
Read from dbSNP xml dump file
Import SNPchiMp data into smarter database
Import Illumina manifest data into database
Model objects with mongoengine
Model smarter ids
Configure environments, requirements and dependencies