History
TODO
Check chromosomes in Variants locations: mind to scaffold, null, and non-autosomal chromosomes for Goat and Sheep
Rename objects (use names in a consistent way, ex TOP, BOT)
Release a smarter coordinate version with information on every variant defined in database (which will be used as reference)
Map affymetrix snps in OARV3 coordinates
Check if
rs_id
is still valid or not (with EVA)Manage python packages with poetry
Rename
manifacturer
intomanufacturer
0.4.9 (2023-09-27)
Load phenotypes for Fosses, Provencale goat breeds
Add sex for Fosses, Provencale goat breeds
Add sex while importing metadata
Load multiple phenotypes for Boutsko foreground sheeps
Add multiple phenotypes as a list (103)
Update datasets metadata
Update dependencies
0.4.8 (2023-06-28)
Capitalize
species_class
parameter insrc.data.import_breeds.py
Generate output files for OARV4 and CHIR1 (#87)
Import data from dbSNP152 (#15)
Import data from IGGC (#18)
Split
import_consortium.py
inimport_isgc.py
andimport_iggc.py
to import data from Sheep and Goat genome consortia respectivelyForce data update when importing from consortium
Track date when importing from consortium
Determine
illumina_top
data directly from variant for Sheep when importing from consortium dataUniform note metadata field (add a note parameters in import metadata)
Import data from Cortellari et al 2021 (https://doi.org/10.1038/s41598-021-89900-2)
Import data from Burren et al 2016 (https://doi.org/10.1111/age.12476)
Revise illumina A/B genotype tracking
Import from Illumina report with only 3 columns in SNP list file
Update dependencies
0.4.7 (2022-12-23)
Import background data from Gaouar et al 2017 (https://doi.org/10.1038/hdy.2016.86)
Import from plink with illumina coding (as specified in manifest: not top nor forward)
Import background data from Belabdi et al 2019 (https://doi.org/10.1038/s41598-019-44137-y)
Import background data from Ciani et al 2020 (https://doi.org/10.1186/s12711-020-00545-7)
Import background data from Barbato et al 2017 (https://doi.org/10.1038/s41598-017-07382-7)
Update species for european mouflon
Support species update with
import_metadata.py
Import 18 welsh breed as background genotypes
Rename two welsh breeds
Model doi in datasets
Upgrade CI workflows to
actions/cache@v3
Add
SNPconvert.py
scriptImport genotypes of other WPs coming from Uruguay
Deal with affymetrix report with less SNPs than declared
Add an option to skip coordinate check when importing affymetrix report
Import from affymetrix a limited number of samples
Skip sample creation when there’s no alias
Support for missing columns in affymetrix report files
Support invalid python names in
src.features.affymetrix.read_affymetrixRow
Update requirements
Deal with missing files in
import_datasets.py
Update Uruguay metadata locations
Move Galway sheep to Ireland country (Ovine HapMap)
0.4.6 (2022-09-26)
Update requirements
Read from affymetrix A/B reportfile
Import latest Uruguayan data (#65)
Configure database connection (#66)
Update sex in ped file if there are information in database
Enable continuous integration for documentation (ReadTheDocs)
Update documentation
Track full species information in Sample (support for multi-species sheep and goats)
Updated isheep exploration notebooks
Deal with unknown countries and species
Fix issues related on alias when creating samples or adding metadata
Fetch variants using positions
Import from plink using genomic coordinates
Import 50K, 600K and WGS isheep datasets (#47)
Fix issue in
src.features.plinkio.plink_binary_exists
Code refactoring in
src.features.plinkio
Import data from Sheep HapMap V2
0.4.5 (2022-06-14)
Update requirements
Import data from Hungary (#53)
Create a new sample when having the same
original_id
in dataset but for a different breedillumina_top
is an attribute of variant, and is set when the first location is loaded.Check variants data before update (#56)
Simplified
import_affymetrix
scriptImport custom affymetrix chips (Oar_v3.1)
Support source and destination assemblies when importing from plink or affymetrix source files
Deal with spaces in filenames while importing from plink
Add
affy_snp_id
primary keyUpdate
import_affymetrix.py
scriptImport data from Spain (#52)
Fix 20220503 dataset breed and churra chip name
Track manifest probe
sequence``s by ``chip_name
Track
probeset_id
bychip_name
Search for affymetrix
probeset_id
in the properchip_name
while importing samplesTrack multiple
rs_id
Fetch churra coordinates by
rs_id
andprobeset_id
and filter out unmanaged SNPsIf
src_dataset
anddst_dataset
are equals, provide onlysrc_dataset
0.4.4 (2022-02-28)
Model location with
MultiPointField
Describe smarter metadata
Import sweden goat metadata
Import latest 290 samples greek dataset
Fix issue with greek samples name (
B273
converted intoB273A
)Add latest 19 sheep greek samples
Add a country collection
Update dependencies
0.4.3 (2021-11-11)
Add 270 Frizarta background samples
Import from ab plink and support multiple missing letters
Track database status and constants
Add foreground/background type attribute in
SampleSpecies
Update dependencies
Add make rule to pack results and make checksum
Move greek foreground metadata to a custom phenotypes dataset
Update greek foreground metadata
Import phenotypes from Uruguay
Import phenotypes using alias
Allow phenotypes for ambiguous sex animals
Import french goat foreground dataset
Pin
plinkio
to support extra-chroms in plink binary filesImport 5 Sweden Sheep background genotypes
Force half-missing SNPs to be MISSING
Add the README.txt.ftp
Bug fixed in importing multibreed reportfile (setting FID properly in output)
0.4.2 (2021-08-27)
Set nullable
ListField
for sample locations and variant consequencesCapitalize phenotype values (ie milk -> Milk)
Import greek chios-mytilini-boutsko sheep dataset
Track multiple location for sample (deal with transhumant breeds )
Import greek skopelios-eghoria goat dataset
Use sample data to deal with multi breeds illumina row files
Determine fid from database with IlluminaReportIO
Import greek frizarta-chios-pelagonia sheep dataset
Import greek frizarta-chios sheep dataset
Import sweden foreground goat dataset
Update ADAPTmap breed names and phenotypes import
Check that breed exists while inserting phenotype data
Import french foreground sheep dataset
Use
elemMatch
in projection inplinkio.SmarterMixin.fetch_coordinates
(ex:VariantSheep.objects.fields(elemMatch__locations={"imported_from": "SNPchiMp v.3", "version": "Oar_v4.0"})
)Use
elemMatch
to search a SNP within the desired coordinate systems inplinkio.SmarterMixin.fetch_coordinates
Skip SNPchimp indels when importing from SNPchimp
Skip illumina indels when reading from manifest
0.4.1 (2021-09-08)
Add
chip_name
in Dataset (database value, not user value)Skip
null
fields when importing datasetsImport uruguay sheep affymetrix data
Import from affymetrix dataset
Rely on original affymetrix coordinate system to determine illumina top alleles
Search samples aliases while importing genotypes
Clearly state when creating samples (ignore samples if not defined in database)
Track sample aliases for
original_id
Import samples from file by providing country and breeds values as parameters
Import sheep coordinates from genome project
Security updates
Fix github Workflow
0.4.0 (2021-06-18)
dbSNP
feature library refactorfix linter issues
Transform affymetrix unmapped chrom to
0
Transform SNPchiMp unmapped chroms to
0
ignore affymetrix insertions and deletions
join affymetrix data with illumina relying on
cust_id
define
illumina_top
from affymetrix flanking sequencesload data from affymetrix manifest
calculate illumina_top from affymetrix sequence
Test import data from snpchimp
Import
OARV4
coordinatesdata/common
module refactoringFix bug in importing dataset order
Model affymetrix fields
Read from affymetrix manifest file
Track illumina manufactured date
0.3.1 (2021-06-11)
Upgrade dependencies
Enable continuous integration
Github Workflow
Coverage
0.3.0 (2021-05-19)
Deal with multi-sheets
.xlsx
documentsImport phenotypes (from a source dataset to a destination dataset)
Define phenotype attribute as a
mongoengine.DynamicDocument
fieldImport metadata or phenotype by breeds or by samples
Import metadata (from a source dataset to a destination dataset)
Forcing
plink
chrom options when converting in binary formatsimport data from ADAPTmap project
Import goat breeds (from a source dataset to a destination dataset)
Import goat data from plink files
Import goat metadata
Import goat data from manifest and snpchimp
configure
mongodb-express
credentialsAdd Goat Related tables
add
variantGoat
collectionadd
sampleGoat
collection
0.2.3 (2021-05-03)
Unset ped columns if relationship can’t be derived from data (ex. brazilian BSI)
Deal with geographical coordinates
Add features to samples (relying on metadata file)
0.2.2 (2021-04-29)
Breed name should be a unique key within species
make rule to clean-up
interim
dataskip already processed file from import
Deal with
mother_id
andfather_id
(search forsmarter_id
in database)Deal with multi-countries dataset
track country in aliases while importing breeds from dataset
0.2.1 (2021-04-22)
Track
chip_name
with samplesDeal with binary plink files
Search breed by aliases used in
dataset
:match fid with breed aliases in
dataset
store aliases by
dataset
Add breeds from
.xlsx
files
0.2.0 (2021-04-15)
Merge multiple files per dataset
Import from an illumina report file
Deal with AB allele coding
Deal with plink text files using modules
Fix SNPchiMp data import
Determine
illumina_top
coding as a property relying on database dataSupport multi-manifest upload (extend database with HD chip)
Deal with compressed manifest
Add breeds with CLI
Check coordinates format relying on DRM
Test stuff with
mongomock
0.1.0 (2021-03-29)
Start with project documentation
Explore background datasets
Merge plink binary files
Convert from
forward
toillumina_top
coordinatesConvert to plink binary format
Manage database credentials
Import samples into
smarter
database while fixing coordinates and genotypesConfigure tox and sphinx environments
Model breeds in
smarter
databaseImport datasets into database
Read from dbSNP xml dump file
Import SNPchiMp data into
smarter
databaseImport Illumina manifest data into database
Model objects with
mongoengine
Model smarter ids
Configure environments, requirements and dependencies