
Update FASTTEntity management #102

@jecisc

We have a trait FASTTEntity, and it seems that all FAST entities are expected to use it, but it does not help extensibility.

The only thing this trait provides is storage for the start and end positions of a node's source code, expressed as a number of characters. That choice introduces a limitation.

TreeSitter reports positions as a number of bytes, and converting a number of bytes into a number of characters is really costly.
I have an importer that just instantiates one FAST entity for each TreeSitter node without any additional work, and the vast majority of the time is spent converting the positions from bytes to characters.

I parsed a file of almost 900 lines of code and it took 6.5 s. If I use the positions in bytes, the import time drops to 100 ms.
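To illustrate where that time can go, here is a minimal sketch in Python (not Pharo, and not the importer's actual code). It assumes UTF-8 source and a hypothetical helper `byte_to_char_offset`. Each conversion has to walk the file prefix up to the offset, so doing it for every node costs roughly (number of nodes) x (file size), while keeping the byte offset costs nothing extra.

```python
def byte_to_char_offset(source: bytes, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset into a character offset.

    Hypothetical helper for illustration only. Decoding the prefix is
    linear in byte_offset, so calling this once per node makes the total
    cost grow with (number of nodes) x (file size).
    """
    return len(source[:byte_offset].decode("utf-8"))


# Two non-ASCII characters before "x": byte and character offsets diverge.
source = "é = 'ü'\nx = 1\n".encode("utf-8")

x_byte_offset = source.index(b"x")  # offset as a byte-based parser would report it
x_char_offset = byte_to_char_offset(source, x_byte_offset)
```

In this sample, `x` sits at byte offset 10 but at character offset 8, because `é` and `ü` each take two bytes in UTF-8.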

Since we have more and more FAST implementations based on TreeSitter, if we want something efficient, we need to save the positions in bytes instead of in characters. This would also make reading the source text faster.

What I propose is:

  • Deprecate FASTTEntity (we do not have a FamixTEntity, so why have one in FAST?)
  • Introduce FASTTCharacterRelativeSource, which would be the current FASTTEntity
  • Introduce FASTTByteRelativeSource, which would be an alternative implementation based on the number of bytes

This would allow TreeSitter-based importers to use FASTTByteRelativeSource and import models much, much faster.
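As a rough sketch of the idea (in Python rather than Pharo, with hypothetical class and method names that only mirror the proposed traits), the two alternatives could store the same pair of positions and differ only in the unit and in how the source text is read back. The sketch assumes 0-based, end-exclusive offsets and UTF-8 source, which may not match the conventions FAST actually uses.

```python
from dataclasses import dataclass


@dataclass
class CharacterRelativeSource:
    """Positions counted in characters (what FASTTEntity stores today)."""
    start_pos: int
    end_pos: int

    def source_text(self, text: str) -> str:
        return text[self.start_pos:self.end_pos]


@dataclass
class ByteRelativeSource:
    """Positions counted in UTF-8 bytes, as a byte-based parser reports them."""
    start_pos: int
    end_pos: int

    def source_text(self, data: bytes) -> str:
        # Slice the raw bytes first, then decode only the node's own span.
        return data[self.start_pos:self.end_pos].decode("utf-8")


text = "é = 1\nname = 2\n"
data = text.encode("utf-8")

by_chars = CharacterRelativeSource(6, 10).source_text(text)
by_bytes = ByteRelativeSource(7, 11).source_text(data)
```

Both lookups return `name`; the byte-relative one never needs a byte-to-character conversion at import time.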

Related issue: Evref-BL/Pharo-Tree-Sitter#32
