API changes for Python 3
alexrudnick edited this page Jan 3, 2013 · 16 revisions
- methods that return sequences should produce iterators by default; there would be no special iter methods
  - should we remove the `..._iter` methods completely, or just deprecate them?
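A minimal sketch of what "iterators by default" could look like from the caller's side. `ToyCorpus` and its `words()` method are hypothetical stand-ins, not NLTK classes; the point is only that one lazy method covers both uses, and callers who need a list ask for one explicitly.

```python
class ToyCorpus:
    """Hypothetical stand-in for a corpus reader; not an NLTK class."""

    def __init__(self, sentences):
        self._sentences = sentences

    def words(self):
        # Generator method: yields words lazily instead of building a list,
        # so no separate words_iter() variant is needed.
        for sentence in self._sentences:
            yield from sentence.split()


corpus = ToyCorpus(["the cat sat", "on the mat"])
words = corpus.words()   # an iterator; nothing has been consumed yet
first = next(words)      # pull a single item
rest = list(words)       # materialize the remainder only when a list is needed
```

Code that indexes or takes `len()` of the result would need an explicit `list(...)` call, which is the main migration cost of this proposal.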
- make `Tree(s)` synonymous with `Tree(s, [])` and use `Tree.parse(s)` directly
  - this would simplify the code in tree.py a lot! I'm all for it, --Peter Ljunglöf
  - but the name `parse` is unfortunate -- it is too reminiscent of all the NLTK parsers -- how about `fromstring`? (used in the libraries array, xml.etree, lxml, numpy, ...)
  - The same argument could be made for nltk.align.Alignment, --Peter Ljunglöf
    - `__new__` is used to be able to give a Giza string instead of a list of pairs.
    - Suggestion: add classmethod `Alignment.fromGiza` and let the constructor only allow a list of pairs.
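The classmethod suggestion above can be sketched roughly as follows. `ToyAlignment` is a hypothetical stand-in for `nltk.align.Alignment`, and the string format (whitespace-separated `i-j` tokens) is an assumption made for illustration; only the name `fromGiza` comes from the proposal.

```python
class ToyAlignment:
    """Hypothetical stand-in for nltk.align.Alignment."""

    def __init__(self, pairs):
        # The constructor accepts only an iterable of (i, j) pairs.
        self.pairs = frozenset((int(i), int(j)) for i, j in pairs)

    @classmethod
    def fromGiza(cls, s):
        # Assumed format: whitespace-separated "i-j" tokens, e.g. "0-0 1-2".
        return cls(tuple(token.split("-")) for token in s.split())


from_pairs = ToyAlignment([(0, 0), (1, 2)])
from_string = ToyAlignment.fromGiza("0-0 1-2")
same = from_pairs.pairs == from_string.pairs
```

With string parsing moved into a named alternative constructor, `__init__` has a single signature and no `__new__` override is needed.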
- not use `__new__` in Abstract Base Classes, for propagating the constructor to subclasses --Peter Ljunglöf
  - example: `FeatStruct(x)` returns a FeatDict or a FeatList, depending on `x`. This means that `type(T(x)) != T` for some classes T, which is unintuitive (and un-object-oriented).
  - Suggestion: the same as above -- use `FeatStruct.parse(s)` or something like that
  - This also holds for nltk.sourcedstring.SourcedString, I think
  - I'm not sure if it will work for nltk.util.AbstractLazySequence
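To make the `type(T(x)) != T` point concrete, here is a small sketch with hypothetical classes (`ToyStruct`, `PlainStruct`, and their subclasses are illustrations, not the real `FeatStruct` code). The first hierarchy dispatches in `__new__`; the second moves the same dispatch into a classmethod, here called `build` purely as a placeholder name.

```python
class ToyStruct:
    def __new__(cls, data):
        # Dispatch to a subclass when the base class is called directly.
        if cls is ToyStruct:
            cls = ToyDict if isinstance(data, dict) else ToyList
        return object.__new__(cls)

    def __init__(self, data):
        self.data = data


class ToyDict(ToyStruct):
    pass


class ToyList(ToyStruct):
    pass


class PlainStruct:
    def __init__(self, data):
        self.data = data

    @classmethod
    def build(cls, data):
        # Same dispatch, but behind an explicitly named factory.
        target = PlainDict if isinstance(data, dict) else PlainList
        return target(data)


class PlainDict(PlainStruct):
    pass


class PlainList(PlainStruct):
    pass


surprising = type(ToyStruct({"a": 1}))          # ToyDict, not ToyStruct
plain = type(PlainStruct({"a": 1}))             # PlainStruct, as written
dispatched = type(PlainStruct.build({"a": 1}))  # PlainDict, by explicit request
```

In the second hierarchy the constructor always returns the class that was called, and the type-dependent behaviour is visible at the call site.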
- remove/deprecate nltk.misc.babelfish?
  - we should go ahead and remove it, yeah. https://github.com/nltk/nltk/issues/265 --Alex Rudnick
- perhaps this could be used to simplify sem/logic.py?
- `ConditionalFreqDist.conditions()` currently returns a sorted list, which is inefficient:
  - Suggestion: Just let it return `.keys()` without sorting.
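A rough sketch of the unsorted variant, using a hypothetical `ToyCFD` built on `defaultdict` and `Counter` rather than the real `ConditionalFreqDist`. It assumes Python 3, where `.keys()` returns a view and no list is copied or sorted.

```python
from collections import Counter, defaultdict


class ToyCFD(defaultdict):
    """Hypothetical stand-in for ConditionalFreqDist."""

    def __init__(self, pairs=()):
        super().__init__(Counter)
        for condition, sample in pairs:
            self[condition][sample] += 1

    def conditions(self):
        # Return the key view directly: no copy, no sort.
        return self.keys()


cfd = ToyCFD([("b", "x"), ("a", "y"), ("b", "z")])
unsorted_conditions = cfd.conditions()
sorted_conditions = sorted(cfd.conditions())  # callers sort only when they need to
```

Callers that relied on the sorted order would have to call `sorted()` themselves, so the change is not fully backward compatible.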
- we may need to wrap word_tokenize() in sent_tokenize(), since some users (and the book?) apply word_tokenize to un-sentence-segmented text
- `Tree` should not be a subclass of `list` --Peter Ljunglöf
  - Almost all list operations are unsupported on trees anyway. E.g., `+` or `*` are not supported, but `+=` is.