src.features.utils
Created on Mon Mar 15 14:13:51 2021
@author: Paolo Cozzi <paolo.cozzi@ibba.cnr.it>
- class src.features.utils.TqdmToLogger(logger, level=None)
Bases: StringIO
Output stream for tqdm that writes to the logging module instead of stdout.
- buf = ''
- flush()
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
- level = None
- logger = None
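A minimal sketch of how a stream like this could work, assuming the class keeps the last text written in `buf` and emits it through `logger` at `level` when `flush()` is called. The `write()` behaviour, the INFO fallback, and the names `TqdmToLoggerSketch` and `ListHandler` are assumptions made for illustration; they are not taken from the project's source.

```python
import io
import logging


class TqdmToLoggerSketch(io.StringIO):
    """Hypothetical stand-in for TqdmToLogger; not the project's source."""

    logger = None
    level = None
    buf = ''

    def __init__(self, logger, level=None):
        super().__init__()
        self.logger = logger
        # Assumption: fall back to INFO when no level is given
        self.level = level or logging.INFO

    def write(self, buf):
        # Assumption: keep only the latest progress text, without control characters
        self.buf = buf.strip('\r\n\t ')

    def flush(self):
        # Emit the buffered text as a log record instead of printing it
        self.logger.log(self.level, self.buf)


class ListHandler(logging.Handler):
    """Collects log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("tqdm_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

stream = TqdmToLoggerSketch(logger, level=logging.INFO)
stream.write("\r 50%|#####     | 5/10")  # what a progress bar might write
stream.flush()                            # the text goes to the logger
```

With tqdm installed, a stream of this kind would typically be handed to the progress bar through tqdm's `file` argument, for example `tqdm(items, file=stream)`.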
- src.features.utils.find_duplicates(header: list) -> list
Find duplicate columns in a list. Returns the indexes to remove, i.e. every occurrence after the first.
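The behaviour described above can be sketched as follows. This is an illustrative reimplementation under the assumption that columns are compared by exact equality; `find_duplicates_sketch` is a hypothetical name, not the project's function.

```python
def find_duplicates_sketch(header: list) -> list:
    """Return the indexes of items that repeat an earlier item."""
    seen = set()
    to_remove = []

    for idx, column in enumerate(header):
        if column in seen:
            # Already met this column: mark this later occurrence for removal
            to_remove.append(idx)
        else:
            seen.add(column)

    return to_remove


header = ["snp", "chrom", "pos", "chrom", "snp"]
duplicates = find_duplicates_sketch(header)

# Dropping the reported indexes keeps only the first occurrence of each column
cleaned = [col for idx, col in enumerate(header) if idx not in duplicates]
```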
- src.features.utils.get_interim_dir() -> PosixPath
Return the smarter temporary (interim) data directory.
- Returns
the smarter temporary data directory
- Return type
PosixPath
- src.features.utils.get_processed_dir() -> PosixPath
Return the smarter processed data directory (final processed data).
- Returns
the smarter final processed data directory
- Return type
PosixPath
- src.features.utils.get_project_dir() -> PosixPath
Return the smarter project directory, which is three levels above the module in which this function is stored.
- Returns
the smarter project base directory
- Return type
PosixPath
- src.features.utils.get_raw_dir() -> PosixPath
Return the smarter raw data directory.
- Returns
the smarter raw data directory
- Return type
PosixPath
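As a rough illustration of how the directory helpers relate to each other, the sketch below resolves a project directory three levels above a module file and derives data directories from it. The module location and the `data/interim`, `data/processed` and `data/raw` folder names are assumptions made for the example; this page does not state the actual layout.

```python
from pathlib import PurePosixPath


def project_dir_sketch(module_file: str) -> PurePosixPath:
    """Return the directory three levels above the given module file."""
    # parents[0] is the module's folder, parents[1] its parent, parents[2] the next one up
    return PurePosixPath(module_file).parents[2]


# Hypothetical location of the module on disk
module_file = "/home/user/project/src/features/utils.py"
project_dir = project_dir_sketch(module_file)

# Assumed layout: the data folders live under <project>/data
interim_dir = project_dir / "data" / "interim"
processed_dir = project_dir / "data" / "processed"
raw_dir = project_dir / "data" / "raw"
```

`PurePosixPath` is used here only so the example does not touch the filesystem; the documented functions return `PosixPath` objects.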
- src.features.utils.sanitize(word: str, chars=['.', ',', '-', '/', '#'], check_mongoengine=True) -> str
Sanitize a word by removing unwanted characters and lowercasing it.
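A minimal sketch of the described clean-up, assuming the unwanted characters are simply deleted before lowercasing. Whether the real function deletes them or substitutes a separator is not stated here, and the `check_mongoengine` option is not documented on this page, so it is left out. `sanitize_sketch` is a hypothetical name.

```python
def sanitize_sketch(word: str, chars=(".", ",", "-", "/", "#")) -> str:
    """Delete every character listed in chars, then lowercase the result."""
    for char in chars:
        # Assumption: unwanted characters are dropped, not replaced
        word = word.replace(char, "")

    return word.lower()


cleaned_word = sanitize_sketch("Sample-ID#1")
```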
- src.features.utils.skip_comments(handle: io.TextIOWrapper, comment_char='#') -> (int, list)
Ignore comment lines from an open file handle. Return the stream position immediately after the comments, together with all the comment lines as a list.
- Parameters
handle (io.TextIOWrapper) – An open file handle.
comment_char (str, optional) – The comment character used in the file. The default is “#”.
- Returns
The stream position after the comments and the ignored lines as a list.
- Return type
(int, list)
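The idea can be sketched as below, assuming comment lines appear only at the top of the file and are returned exactly as read (including the trailing newline). `skip_comments_sketch` is a hypothetical name, and an in-memory `io.StringIO` stands in for an open file.

```python
import io


def skip_comments_sketch(handle, comment_char="#"):
    """Read leading comment lines; return (position after them, the lines)."""
    skipped = []
    position = handle.tell()

    while True:
        # readline() is used instead of iteration so that tell() stays usable
        line = handle.readline()

        if not line.startswith(comment_char):
            # First non-comment line (or end of file): stop reading
            break

        skipped.append(line)
        position = handle.tell()

    return position, skipped


handle = io.StringIO("# first\n# second\nchrom,pos\n1,100\n")
position, skipped = skip_comments_sketch(handle)

# Rewind to the first line after the comments before parsing the data
handle.seek(position)
first_data_line = handle.readline()
```

Returning the position rather than relying on the handle's current offset lets the caller `seek()` back to the first data line, since reading ahead to detect the end of the comments consumes that line.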