Tab annotations #1175

Open
wants to merge 4 commits into
base: master

duncanka
Contributor

Previously, attempting to make an annotation spanning newlines with multiple fragments would behave unreliably, and sometimes cause data corruption. This is now fixed. Additionally, I extended the same discontinuous span technique used to solve #786 to solve #819, adding support for annotations spanning tabs. (Arguably, proper escaping would still be a more elegant solution.)
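To illustrate the general idea of a discontinuous span that steps around newlines and tabs, here is a minimal sketch. It is not the code from this PR: `split_span` and `_BREAK_CHARS` are hypothetical names, and the assumption is simply that a requested character range gets cut into fragments wherever a run of newlines or tabs occurs.

```python
import re

# Hypothetical helper for illustration only; not taken from the brat codebase.
# Runs of these characters are treated as break points between fragments.
_BREAK_CHARS = re.compile(r"[\n\t]+")


def split_span(text, start, end):
    """Split the range [start, end) into fragments containing no newline or tab.

    Returns a list of (start, end) offset pairs into `text`.
    """
    fragments = []
    pos = start
    # finditer(string, pos, endpos) restricts matching to the requested range.
    for match in _BREAK_CHARS.finditer(text, start, end):
        if match.start() > pos:
            fragments.append((pos, match.start()))
        pos = match.end()
    if pos < end:
        fragments.append((pos, end))
    return fragments


text = "foo\tbar\nbaz"
print(split_span(text, 0, len(text)))
```

Under this sketch, a range made up entirely of newlines or tabs yields an empty list, so a caller would need to decide how to handle a span with no fragments at all.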

Unresolved issue: when virtual fragments are created to handle line splits and tabs, deleting/moving a virtual fragment deletes only that one small chunk. It does not delete anything beyond the nearest newline or tab, which may mean part of the larger fragment the user meant to delete still remains.

(Apologies for all the distracting whitespace deletions from editing in Emacs.)

@benolayinka

@duncanka I'm having a similar nightmare with this. I'm running an annotation server which, for whatever reason, often includes newlines in annotations, and they crash brat completely. Your solution of splitting at newlines into separate annotations mostly works, but it breaks if the annotation consists only of newlines (i.e., "\n\n").

I also tried digging around to figure out where the annotations are written to the file, so that I could simply escape them there, but I still haven't grasped it completely. What was your reason for avoiding this approach?

@duncanka
Contributor Author

@benolayinka I thought it was impossible to annotate an all-whitespace span anyway; i.e., the UI won't even allow it. I know zero-width spans were implemented a while back, but as far as I can tell the feature is basically non-functional at this point. What's your use case that requires all-newline annotations?

It's been a while since I wrote this patch or otherwise tinkered with brat, but I think the main reason I took this approach was simply that it was easier. If you do want to try implementing escaping, I believe the code you want to modify is in annotation.py (particularly TextBoundAnnotationWithText.__str__) and annotator.py (particularly _create_span). But part of my fear with escaping was that there are other corners of the code that unexpectedly rely on the underlying text exactly matching the error-checking text in the .ann files.
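For reference, the escaping alternative could look roughly like the sketch below. This is only an assumption about what "proper escaping" might mean here: `escape_text` and `unescape_text` are hypothetical helpers, not existing brat functions, and the backslash scheme is one possible choice rather than anything defined by the .ann format. The idea is that the annotated text is escaped before being written to a single line and unescaped when read back.

```python
# Hypothetical helpers for illustration only; not part of brat.
# The backslash itself must be escaped so that the mapping is reversible.
_ESCAPES = {"\\": "\\\\", "\n": "\\n", "\t": "\\t"}
_UNESCAPES = {"\\": "\\", "n": "\n", "t": "\t"}


def escape_text(text):
    """Replace backslashes, newlines and tabs with two-character escapes."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape_text(escaped):
    """Reverse escape_text; unknown escape sequences are kept unchanged."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


original = "first line\nsecond\tline"
stored = escape_text(original)
assert "\n" not in stored and "\t" not in stored
assert unescape_text(stored) == original
```

Note that the escaped string no longer matches the underlying document text character for character, which is exactly the kind of mismatch that any code comparing the .ann text against the source text would have to account for.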

@benolayinka

benolayinka commented Oct 30, 2017 via email

2 participants