Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable add accented words #46

Open
Chavao opened this issue Sep 20, 2023 · 13 comments
Open

Unable add accented words #46

Chavao opened this issue Sep 20, 2023 · 13 comments
Assignees

Comments

@Chavao
Copy link
Contributor

Chavao commented Sep 20, 2023

Describe the bug
I tried to add words with accents, I'm Brazilian so we have a lot of words with accents.
When I press the acute accent and the following letter, the letter disappears.

E.g.

I wrote fiancé

image

image

To Reproduce
Steps to reproduce the behavior:

  1. Press a to add a new Task
  2. Write an accented word

Expected behavior
Appear the accented word, like fiancé

Software configuration (please complete the following information):

  • OS: Ubuntu 22.04.3 LT
  • Terminal emulator: Tilix running zsh
  • Version: latest
@Chavao
Copy link
Contributor Author

Chavao commented Sep 20, 2023

If you have any tips on how to solve it, I could open a PR to fix it.
I didn't do it yet, because I'm not a C++ programmer, so it's hard to figure out the path to solve it.

@stweise
Copy link
Collaborator

stweise commented Sep 21, 2023

hi @Chavao thank you so much for the input.
I am looking into options right away.

@stweise
Copy link
Collaborator

stweise commented Sep 29, 2023

Support for wide characters if full of pitfalls it turns out.
Finding these is hard and I have therefore create a minimal testing application to verify the basics before I go into much more detail in code.

I also simplified the build process and would like to verify that on different platforms.

please check out https://github.com/gtaubman/doneyet/tree/wide_char_support
build using cmake (mkdir build; cd build; cmake ..; make; )
and run wide_input_test (./wide_input_test) and let me know if you are able to add accented characters

this is what it looks like for me on some input:
Bildschirmfoto 2023-09-29 um 12 34 50

let me know if it works for you too and please give me some details about your platform (OS, Version)

This test: Macbook Air M2, macOS 13.5.2 (22G91)

@stweise
Copy link
Collaborator

stweise commented Sep 29, 2023

Also works on Xubuntu (Ubuntu with XFCE) 22.04.02 LTS

@stweise
Copy link
Collaborator

stweise commented Sep 29, 2023

hmm seems I had ncursesw already installed on those machines, testing on a plain machine right now

Kubuntu (Ubuntu with KDE) 22.04.03 LTS

needed to install these apt install cmake pkg-config libncursesw5-dev libncurses-dev

works as well

@Chavao
Copy link
Contributor Author

Chavao commented Sep 29, 2023

I tried on Lubuntu 22.04.3 LTS using zsh on Tilix and it worked fine

image

@stweise
Copy link
Collaborator

stweise commented Sep 29, 2023

awesome, then I know it works for you and me :-)

I am working on cleaning the code right now, something I have been pushing off for years. This should simplify the task of adding, tracking and debugging the changes.

I will focus in the serialization first, as I fear that this is where all of the extension will break.
for reference see

void Task::Serialize(Serializer* s) and
void Serializer::WriteString(string str)

currently tasks are stored by writing chars which are 8 bit -> 1 byte
but our new characters are wchar_t, which is 16 bit or 64 bit > 2 or 4 byte depending on the compiler used

@gtaubman
Copy link
Owner

gtaubman commented Sep 29, 2023 via email

@stweise
Copy link
Collaborator

stweise commented Oct 5, 2023

I reworked the serializer to support the wide characters.
Created a little test application if you like to test too

./wide_serializer_test
takes wide input and writes it together with a lengthy test string into a file called bla
Bildschirmfoto 2023-10-05 um 08 38 49

hex dump shows correct width information & printable ascii chars.
Bildschirmfoto 2023-10-05 um 08 39 28

Now I will adept most of the other classes handle any kind of string.

@stweise
Copy link
Collaborator

stweise commented Oct 11, 2023

Support for wide characters is well underway.
I am ready to set up a basic new project now, even using emoji

starting ./wide_serializer_test
given this input
Bildschirmfoto 2023-10-11 um 22 03 15

and this rough code structure

 Project *p =  new Project(input);
 Serializer* s = new Serializer("", "./bla");
 s->SetVersion(NOTES_VERSION);
 p->AddTaskNamed(L"new task meet with fiancé");
 p->AddTaskNamed(L"cool new öäü");

we already have this binary representation now
Bildschirmfoto 2023-10-11 um 22 03 40

i also already read this back and the wide strings match perfectly.

Next up: make sure all the ncurses drawing still works.

@stweise
Copy link
Collaborator

stweise commented Oct 31, 2023

Wanna give a little heads-up.

I am still working on this.
The way that wide char support is handled in ncurses forms (form.h) is really unclear to me. There are no additional functions offered and the underlying implementation seems to just accept rendered UTF-8 in char *.
I am going to create a simple forms application and verify this behavior. Doing so in the main application is way too tedious.

@stweise
Copy link
Collaborator

stweise commented Feb 7, 2024

I am happy to report that I was able to build a version that works with UTF-8. Please see attached screenshot.

The development branch is located at https://github.com/gtaubman/doneyet/tree/without_wide_char

⚠️ Warning: I have cleaned up the file format for doneyet and made several cleanup changes, don't be overwhelmed by the amount of change you see there.

Still todo:

  • provide a safe way for contributors to test (file format breaking change)
  • provide a safe transition for the file format change
  • test extensively on Linux & Mac (@gtaubman did you ever really support this for Windows?)
  • document the input methods (utf-8 on Linux \u+008 - Ctrl + Shift + U)
  • refactor change into packets someone can review/understand if needed
  • Use LC_TIME instead of what i set via LC_ALL to display date
Bildschirmfoto 2024-02-07 um 20 17 50

@stweise stweise self-assigned this Feb 7, 2024
@gtaubman
Copy link
Owner

gtaubman commented Feb 8, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

When branches are created from issues, their pull requests are automatically linked.

3 participants