Epic | Model improvements 2023-02-15 #74

Closed
apmt opened this issue Feb 14, 2023 · 0 comments · Fixed by #81
Labels: feature-engineering, data preprocessing (all changing, cleaning and validating the data before running the model)

apmt commented Feb 14, 2023

Description

This story describes the model adjustments planned for the 2023-02-05 delivery.

User story

Shannon Entropy

| Who | When | Then |
| --- | --- | --- |
| Tech support | needs to add Shannon Entropy to the KeySmash features | in order to validate whether the model performs better |
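
As an illustration of the feature this story asks for, here is a minimal sketch of character-level Shannon entropy. The function name `shannon_entropy` and the choice of bits (log base 2) are assumptions made for the example, not the project's actual implementation.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character.

    Hypothetical helper: the name and the base-2 logarithm are assumptions.
    """
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # H = sum over characters of p * log2(1 / p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

A string made of one repeated character scores 0, while a string whose characters are all distinct scores `log2(len(text))`, so the value could help separate repetitive input from varied input.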

Tasks

Bigrams Sequence Feature

| Who | When | Then |
| --- | --- | --- |
| Tech support | needs to add the Bigrams Sequence feature to the KeySmash features | in order to validate whether the model performs better |
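
One possible reading of a bigram-based feature is the share of adjacent character pairs that occur more than once in the input. The sketch below assumes that definition; the name `repeated_bigrams_ratio` and the exact formula are hypothetical and may differ from what the project implements.

```python
from collections import Counter


def repeated_bigrams_ratio(text: str) -> float:
    """Fraction of character bigrams in `text` that occur more than once.

    Hypothetical definition, chosen only to illustrate the idea.
    """
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    if not bigrams:
        return 0.0
    counts = Counter(bigrams)
    # Count every occurrence of a bigram that appears at least twice.
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / len(bigrams)
```

Under this definition, a looping pattern such as `"asdasdasd"` scores 1.0 because every bigram recurs, while `"abcd"` scores 0.0.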

Tasks

Fix Y as not consonant in consonant sequence KeySmash feature

| Who | When | Then |
| --- | --- | --- |
| Tech support | needs to stop treating Y as a consonant in the consonant sequence KeySmash feature | in order to validate whether the model performs better |
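
To make the intent of this fix concrete, here is a minimal sketch of a consonant-sequence feature that leaves 'y' out of the consonant set. The names `CONSONANTS` and `longest_consonant_run`, and the choice to measure the longest run, are assumptions for illustration only.

```python
# Latin consonants with 'y' deliberately excluded (assumption for this sketch).
CONSONANTS = frozenset("bcdfghjklmnpqrstvwxz")


def longest_consonant_run(text: str) -> int:
    """Length of the longest run of consecutive consonants in `text`.

    Hypothetical helper: 'y', vowels, digits and symbols all break a run.
    """
    longest = current = 0
    for ch in text.lower():
        if ch in CONSONANTS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest
```

With 'y' excluded, a word like `"rhythm"` has a longest run of 3 (`thm`) rather than 6.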

Tasks

Get context states abbreviations into context abbreviations file

| Who | When | Then |
| --- | --- | --- |
| Tech support | needs to move the context state abbreviations into the context abbreviations file | in order to validate whether the model performs better |

Tasks

Unique Characters

| Who | When | Then |
| --- | --- | --- |
| Tech support | needs to add Unique Characters to the KeySmash features | in order to validate whether the model performs better |
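
A unique-characters feature could be expressed as the ratio of distinct characters to total length. The sketch below assumes that definition; the name `unique_chars_ratio` is hypothetical.

```python
def unique_chars_ratio(text: str) -> float:
    """Ratio of distinct characters to total characters in `text`.

    Hypothetical helper: the ratio form is an assumption for this sketch.
    """
    if not text:
        return 0.0
    return len(set(text)) / len(text)
```

Values near 0 indicate heavy repetition of a few characters, and a value of 1.0 means no character repeats.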

Tasks

This was referenced Feb 14, 2023
apmt self-assigned this Feb 14, 2023
apmt added the feature-engineering and data preprocessing labels Feb 14, 2023
apmt added this to the Sprint 5 milestone Feb 14, 2023
apmt changed the title from "Epic | Model adjustments 2023-02-15" to "Epic | Model improvements 2023-02-15" Feb 15, 2023
apmt linked a pull request Feb 16, 2023 that will close this issue
apmt pushed a commit that referenced this issue Feb 17, 2023
apmt closed this as completed in #81 Feb 17, 2023
apmt added a commit that referenced this issue Feb 17, 2023
* #76 remove 'y' from consonant sequences feature

* #77 add all Mexico states abbreviations and its source in the docstring

* #73 implement shannon entropy method and adapt the threshold calculation to also match values below

* #73 new model with shannon entropy and notebook sets

* #73 fix baja california abbreviation

* #73 fix keysmash sequence test to also consider special characters

* #74 fix private methods naming convention to double underscores

* #75 and #78 add KeySmash features: repeated bigrams and unique chars ratios

* #74 fix tests and parser private methods

* #74 add model test

* #74 update initial sets models

---------

Co-authored-by: atarchetti <[email protected]>
apmt pushed a commit that referenced this issue Feb 23, 2023