Support for more File Formats (RtMI) #6

Open
JanFrederick00 opened this issue Sep 26, 2022 · 18 comments

Comments

@JanFrederick00
Collaborator

from Weird.ggpack1a
*.dinky - only one file. Seems to be an uncompiled script. It contains some global defines such as the character's talk colors and some macros.
*.dink - only one file. I assume that this file contains all compiled scripts. I don't know anything about the format, but the individual scripts seem to be delimited by the byte sequence 9C 78 41 34.
*.anim - Skeletal / Animation data. Actually just JSON.
*.attach - These files probably define attachment points on character's skeletons. These are just JSON as well.
*.blend - These are just text files as well. They start with "# DO NOT EDIT! Original is in SVN!".
*.emitter - GGDict file. I used the gg-tool to convert them to JSON.
*.json - Ironically, they are not actually JSON files but rather GGDict files (which can be converted to JSON just fine). Probably created with the tool TexturePacker - I think it has a plugin interface to write custom data formats. If that is the case, then these files mark which parts of the image files are separate textures.
*.lip - Probably lip sync data. Just text. Each line starts with a timestamp. The letters probably correspond to the character's mouth shapes.
*.otf, *.ttf - fonts
*.txt, *.tsv - Localized string data. Credits, LeDiary, VersionGame, CollectableData, Text
(the collectable data would be very useful if I hadn't already answered all of them correctly on my own... - the correct answers are prefixed by a *)
*.wimpy - I think these contain the actual locations. They are GGDict files and can be converted to json - see the note below. The "gg"-tool is almost able to parse them at the moment, I might have to make another pull request there.
*.yack - I think these are different from TWP's .yack files - the tool from the other repository doesn't work on them.

Note that they changed the format of the GGDicts from TWP to RtMI: the string / key indices are now just 16 bits wide.
When converting the files, the tool may complain about unknown data types. So far I have seen types 9, 10 and 11.
Judging by the data (they are all just saved as strings), 9 is a 2D vector, 10 is a structure of two 2D vectors, and 11 is a structure of three 2D vectors.

from Weird.ggpack5*:
*.atlas - Just text. These seem to reference the sub-images from the .json files and control the color format and filter modes (as well as some sizing and rotation information).

I think the most interesting files, after the graphics / audio, are the dink scripts (even though the comments have been removed during compilation) and the yack files. Most of the other ones are either text, JSON or GGDicts (which could be converted to JSON for preview).

If the Thimbleweed Park explorer implemented a conversion process from GGDICT<->JSON analogous to the gg-tool then the pack parsing could be further improved.

@JanFrederick00
Collaborator Author

[Image: screenshot]

@JanFrederick00
Collaborator Author

[Image: screenshot]
A little preview. I implemented the parsing of the GGDict files and conversion to JSON. This affects the .json, .wimpy and .emitter files.
The parsing of the packs themselves is also handled by this class now.

I'll work on it some more before committing it though, a "save as json" option would be nice to have.

Also, if a file from RtMI is opened, .json files are no longer treated as text.

@comatron

i wanted to extract all music and sfx from rtmi - tried with the 0.4 version and had no luck. any updates on a build of this enhancement? i am no coder :(

@bgbennyboy
Owner

It will come, we've done it successfully manually. I just need to code it so it's done automatically. Hopefully this weekend.

@JanFrederick00
Collaborator Author

There is a third key.
It's 1024 bytes long and used to decrypt the .yack files (AFTER they have already been decrypted.)
The only problem is: the algorithm takes a parameter by which the index into the key is offset. I don't know where this offset comes from.
By experimenting I have figured out that some files (I found Carla.yack and Chums.yack) seem to use the offset 5.
I'm currently trying to find a way to brute-force this.

@JanFrederick00
Collaborator Author

The first byte in the decoded files seems to be 0x00. Starting the key at the first key byte that equals the first data byte seems to work. The largest offset I've seen is 20.
I don't know anything about the decoded files' format, so I can't say whether this format has changed as well.
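
A minimal sketch of that guess, assuming the 1024-byte key and the still-encoded file contents are already in memory. `guess_key_offset` is a made-up name, and the decoding step itself is left out because the thread does not describe the algorithm:

```python
def guess_key_offset(data: bytes, key: bytes) -> int:
    """Return the index of the first key byte that equals the first data byte.

    This mirrors the heuristic described above and nothing more. It raises
    ValueError if no key byte matches.
    """
    if not data:
        raise ValueError("empty file")
    return key.index(data[0])
```

With a toy key such as `bytes([7, 3, 9, 3])` and data starting with `9`, this picks offset 2. A later key byte with the same value would be missed, so the result is only a first candidate.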

@jonsth131

The index for the yack decryption is just the length of the file name without the file extension. For example, Carla.yack has the index 5. This has worked for all yack files I've extracted so far.
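
As a small illustration of that rule (the function name `yack_key_offset` is hypothetical, and only the offset is computed here, not the decryption):

```python
from pathlib import PurePath


def yack_key_offset(file_name: str) -> int:
    """Length of the file name without its extension, e.g. 'Carla' -> 5."""
    return len(PurePath(file_name).stem)
```

Both `Carla.yack` and `Chums.yack` give 5 with this rule, which matches the offsets found by experiment earlier in the thread.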

@JanFrederick00
Collaborator Author

Good catch.

@JanFrederick00
Collaborator Author

I think I understand the structure of the new .yack files now.
I'm not sure whether that format has been used before or if someone else has already figured it out.
The (decrypted) files start with the sequence 00 78 E6 DC.
The next uint points to the start of the string section (seek to this offset, skip 4, read number of strings as uint, then read n null-terminated strings.)

Then, beginning at offset 8, a series of "lines" follows.
Each line contains the following:

  • a byte denoting the type of line.
  • uint: I have no idea what this means, only that it seems to stay the same or increment from line to line. Maybe it's just the line number in the original source-code for debugging purposes
  • uint: No idea. In the file I looked at it was always 0
  • byte: I've only seen it as 0 or 1 - it seems to mark how many extra arguments the line has.
    the actual number of arguments is 2 + number of extra arguments. If a line does not need 2 parameters, the extra one is 0xFFFFFFFF
  • Parameters: each parameter is probably a uint.

Most opcodes seem to have one or two parameters. If a third one is added, it seems to make the line conditional. The condition is the first parameter in this case.
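
The layout described above can be sketched as a small reader. This is only a sketch under assumptions that the thread does not confirm: all integers are taken to be little-endian, the lines are assumed to run from offset 8 up to the start of the string section, and every function name here is made up.

```python
import struct

YACK_MAGIC = bytes.fromhex("0078E6DC")


def read_strings(data: bytes, table_offset: int) -> list:
    pos = table_offset + 4  # skip 4 bytes of unknown purpose
    (count,) = struct.unpack_from("<I", data, pos)
    pos += 4
    strings = []
    for _ in range(count):
        end = data.index(b"\x00", pos)  # strings are null-terminated
        strings.append(data[pos:end].decode("utf-8", errors="replace"))
        pos = end + 1
    return strings


def read_lines(data: bytes, table_offset: int) -> list:
    pos = 8
    lines = []
    # Assumption: the lines end where the string section begins.
    while pos + 10 <= table_offset:
        # type byte, two uints of unknown meaning, extra-argument count
        opcode, unknown_a, unknown_b, extra = struct.unpack_from("<BIIB", data, pos)
        pos += 10
        count = 2 + extra
        params = struct.unpack_from(f"<{count}I", data, pos)
        pos += 4 * count
        lines.append((opcode, unknown_a, unknown_b, params))
    return lines


def parse_yack(data: bytes):
    if data[:4] != YACK_MAGIC:
        raise ValueError("not a decrypted .yack file")
    (table_offset,) = struct.unpack_from("<I", data, 4)
    return read_lines(data, table_offset), read_strings(data, table_offset)
```

An unused parameter shows up as `0xFFFFFFFF` in the returned tuples. How the other parameter values relate to the string section is not decided here.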

Opcodes I have figured out (I think):

  • 0x09: 1 parameter - seems to be a label.
  • 0x01: 2 or 3 parameters: make a character say something. Parameter 1: character, 2: line
  • 0x0A: 1 or 2 parameters: goTo or conditional goTo
  • 0x08: 1 parameter: Seems to "evaluate" the argument - for example the argument might be "spoilerAlert()" to show the "SPOILER ALERT"-message when talking to the curator in the museum.

I'm not so sure about these:

  • 0x64: Seems to describe a dialog choice - 2 parameters: line and the label to go to when this choice is selected.
  • 0x13: Maybe "gosub"? (call code and then return back here)? I'm really not sure. The code called by this seems to go to the label "done" which exists at the end of the file with an opcode of 03 below it and then exiting.

I'm really not familiar with the format of the yack files used in TWP - maybe some of this is explained by them.
There are some others like 0x07 which seems to take an actor as an argument, 0x0C with none and 0x10 which may set up the number of available dialog choices (???).

These "conditions" aren't always how I would expect them either.
I would expect them all to be a comparison operation like
Museum.eyepatch.state == "gone"

But sometimes they contain the name of a file:
?/Users/rong/ggcode/weirdengine/_CompiledCode/Curator.yack12:
(interestingly his internal name for the engine seems to be "weirdengine")

perplexing.
Maybe someone else has already figured it out and I'm making a fool of myself.

@r1sc

r1sc commented Sep 28, 2022

I figured out a bunch of stuff. I can now dump yack-files to text. Look here https://github.com/jonsth131/ggtool/blob/main/libdinky/src/yack.rs

I'm pretty confident with these opcode definitions:

enum YackOpcode {
    End = 0,
    ActorSay = 1,
    EmitCode = 8,
    DefineLabel = 9,
    GotoLabel = 10,
    ChooseReply = 100,
    Unknown
}
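
For anyone scripting against these values outside of ggtool, here is a rough Python equivalent of the enum above. The Python names and the `yack_opcode_name` helper are illustrative only and not part of ggtool:

```python
from enum import IntEnum


class YackOpcode(IntEnum):
    END = 0
    ACTOR_SAY = 1
    EMIT_CODE = 8
    DEFINE_LABEL = 9
    GOTO_LABEL = 10
    CHOOSE_REPLY = 100


def yack_opcode_name(raw: int) -> str:
    """Name a raw opcode byte, keeping values that are not yet identified."""
    try:
        return YackOpcode(raw).name
    except ValueError:
        return f"UNKNOWN_0x{raw:02X}"
```

Keeping the raw value in the fallback name makes it easier to spot opcodes such as 0x13 or 0x07 that are still open questions.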

@bgbennyboy
Owner

Fantastic work yet again!

@JanFrederick00
Collaborator Author

JanFrederick00 commented Sep 29, 2022

I'm looking into the .dink file.
The file contains multiple blocks, each starting with the "magic number" 0x3441789C. The next uint describes the length of the block. Using this information, the blocks can be separated.

Each block seems to describe a function in a script. Weird.dink contains 6478 of these blocks. Each one is in turn made up of multiple sub-blocks and starts with the magic number 0x7F46A125. The header additionally holds 10 more bytes of unknown purpose.
Each sub-block again starts with a 4-byte magic number and the length of the sub-block (excluding the last 8 bytes).
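
A minimal sketch for locating the top-level blocks, assuming little-endian integers (which matches the byte sequence 9C 78 41 34 mentioned at the start of the thread). Because it is not clear what exactly the length field covers, this version only reports the length and finds the next block by searching for the magic number again. `find_blocks` is a made-up name.

```python
import struct

BLOCK_MAGIC = struct.pack("<I", 0x3441789C)  # bytes 9C 78 41 34 on disk


def find_blocks(data: bytes) -> list:
    """Return (offset, declared_length) for every top-level block found."""
    blocks = []
    pos = data.find(BLOCK_MAGIC)
    while pos != -1 and pos + 8 <= len(data):
        (length,) = struct.unpack_from("<I", data, pos + 4)
        blocks.append((pos, length))
        # Search for the next magic instead of trusting the length field.
        pos = data.find(BLOCK_MAGIC, pos + 8)
    return blocks
```

Searching for the magic can give false hits if the same four bytes occur inside a block, so comparing the distance between hits with the declared length is a useful sanity check.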

Here's what I found out:
Block with the ID 0x16F94B62 contains 3 0-terminated strings and two 32-bit integers.
The first string seems to be some sort of UID for the function. The second gives the name of the function and the third the name of the script file (i.e. "Boot.dinky")

Block with ID 0x983f1cfa:
Holds a series of null terminated strings.

Each function contains four more blocks:

  • FD4BC33A: Starts with 04 02... and seems to contain these two bytes at (semi-)regular intervals. Data seems to be organized in 8-byte segments. 04 02 seem to be the most common first two bytes (opcodes?), at least in the first few scripts.
  • 55ED4D1D: Unknown but length is always a multiple of 4.
  • 62D34042: Unknown - organized in 8 bytes?
  • 470DA31C: Length always 0 - seems to mark the end of the function.

@JanFrederick00
Collaborator Author

I have started writing a tool to view the .wimpy files.

@JanFrederick00
Collaborator Author

.dink file:

  • FD4BC33A: Seems to contain definitions of constants. each 8 bytes define a constant (maybe variable?) -> uint type, uint value.
    there seem to be three different types: 0x102: int32, 0x103: float32, 0x204: string - the value describes an offset into the string section (block with id 0x983f1cfa)

  • 55ED4D1D: Probably the instructions themselves. organized in 4 bytes each.
    The first byte seems to be the opcode. 0x33 seems to be "return" as each function ends with it.

  • 62D34042: Triplets of ints (unknown, firstInstruction, lastInstruction). My theory is that this divides the instructions into segments. Maybe these are the line numbers from the source file for debugging purposes?
    The first number seems to be generally unrelated to the instructions in that block.
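
The constant entries of block FD4BC33A could be decoded roughly like this. It is a sketch that assumes little-endian values and assumes the string offset is a byte offset into the raw contents of block 0x983f1cfa. `decode_constants` is a hypothetical name.

```python
import struct

TYPE_INT32 = 0x102
TYPE_FLOAT32 = 0x103
TYPE_STRING = 0x204


def decode_constants(block: bytes, strings: bytes) -> list:
    """Decode 8-byte (type, value) entries into (type, python_value) pairs."""
    constants = []
    for type_id, raw in struct.iter_unpack("<II", block):
        if type_id == TYPE_INT32:
            # Reinterpret the 4 raw bytes as a signed integer.
            value = struct.unpack("<i", struct.pack("<I", raw))[0]
        elif type_id == TYPE_FLOAT32:
            # Reinterpret the 4 raw bytes as a 32-bit float.
            value = struct.unpack("<f", struct.pack("<I", raw))[0]
        elif type_id == TYPE_STRING:
            end = strings.index(b"\x00", raw)
            value = strings[raw:end].decode("utf-8", errors="replace")
        else:
            value = raw  # unknown type: keep the raw value
        constants.append((type_id, value))
    return constants
```

`struct.iter_unpack` raises an error if the block length is not a multiple of 8, which doubles as a check on the 8-byte layout.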

I have no idea what the opcodes are, or how the constants are referenced in the byte code.

@JanFrederick00
Collaborator Author

The lowest 11 bits of each instruction are the op-code.
Op-codes 0x00 and 0x36 are treated as no-op
0x33 returns (int)(instruction >> 23)
0x07 seems to "focus" on a variable
0x14 seems to be a member access / array index access of some sort
0x17 and 0x18 both call into native functions (breakWhileRunning etc...)
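
A short sketch of the bit layout as far as it is known here, assuming the 4-byte instructions of block 55ED4D1D are little-endian. The shift by 23 is only attested for the return instruction, so treating it as a general operand is an assumption. Both function names are made up.

```python
import struct

OPCODE_MASK = 0x7FF  # lowest 11 bits


def split_instruction(word: int):
    """Return (opcode, upper_bits) for one 32-bit instruction word."""
    return word & OPCODE_MASK, word >> 23


def iter_instructions(block: bytes):
    """Yield (opcode, upper_bits) for each 4-byte instruction in a block."""
    for (word,) in struct.iter_unpack("<I", block):
        yield split_instruction(word)
```

For example, the word `(5 << 23) | 0x33` splits into opcode 0x33 with the value 5. What bits 11 to 22 hold is still unknown.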

@JanFrederick00
Collaborator Author

JanFrederick00 commented Oct 1, 2022

Thankfully, the game contains strings for all of the opcodes:

0x01 = OP_PUSH_CONST
0x02 = OP_PUSH_NULL
0x03 = OP_PUSH_LOCAL
0x04 = OP_PUSH_UPVAR
0x05 = OP_PUSH_GLOBAL
0x06 = OP_PUSH_FUNCTION
0x07 = OP_PUSH_VAR
0x08 = OP_PUSH_GLOBALREF
0x09 = OP_PUSH_LOCALREF
0x0A = OP_PUSH_UPVARREF
0x0B = OP_PUSH_VARREF
0x0C = OP_PUSH_INDEXREF
0x0D = OP_DUP_TOP
0x0E = OP_UNOT
0x0F = OP_UMINUS
0x10 = OP_UONECOMP
0x11 = OP_MATH
0x12 = OP_LAND
0x13 = OP_LOR
0x14 = OP_INDEX
0x15 = OP_ITERATE
0x16 = OP_ITERATEKV
0x17 = OP_CALL
0x18 = OP_FCALL
0x19 = OP_CALLINDEXED
0x1A = OP_CALL_NATIVE
0x1B = OP_FCALL_NATIVE
0x1C = OP_POP
0x1D = OP_STORE_LOCAL
0x1E = OP_STORE_UPVAR
0x1F = OP_STORE_ROOT
0x20 = OP_STORE_VAR
0x21 = OP_STORE_INDEXED
0x22 = OP_SET_LOCAL
0x23 = OP_NULL_LOCAL
0x24 = OP_MATH_REF
0x25 = OP_INC_REF
0x26 = OP_DEC_REF
0x27 = OP_ADD_LOCAL
0x28 = OP_JUMP
0x29 = OP_JUMP_TRUE
0x2A = OP_JUMP_FALSE
0x2B = OP_JUMP_TOPTRUE
0x2C = OP_JUMP_TOPFALSE
0x2D = OP_TERNARY
0x2E = OP_NEW_TABLE
0x2F = OP_NEW_ARRAY
0x30 = OP_NEW_SLOT
0x31 = OP_NEW_THIS_SLOT
0x32 = OP_DELETE_SLOT
0x33 = OP_RETURN
0x34 = OP_CLONE
0x35 = OP_BREAKPOINT
0x36 = OP_REMOVED
0x37 = __OP_LAST__
0x38 = _OP_LABEL_
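
For use in a disassembler, the list above can be turned into a lookup table. This sketch covers 0x01 to 0x36 only and leaves out the last two entries, on the assumption that they are internal markers rather than real instructions. `dink_opcode_name` is a hypothetical helper.

```python
_NAMES = """
PUSH_CONST PUSH_NULL PUSH_LOCAL PUSH_UPVAR PUSH_GLOBAL PUSH_FUNCTION
PUSH_VAR PUSH_GLOBALREF PUSH_LOCALREF PUSH_UPVARREF PUSH_VARREF
PUSH_INDEXREF DUP_TOP UNOT UMINUS UONECOMP MATH LAND LOR INDEX ITERATE
ITERATEKV CALL FCALL CALLINDEXED CALL_NATIVE FCALL_NATIVE POP
STORE_LOCAL STORE_UPVAR STORE_ROOT STORE_VAR STORE_INDEXED SET_LOCAL
NULL_LOCAL MATH_REF INC_REF DEC_REF ADD_LOCAL JUMP JUMP_TRUE JUMP_FALSE
JUMP_TOPTRUE JUMP_TOPFALSE TERNARY NEW_TABLE NEW_ARRAY NEW_SLOT
NEW_THIS_SLOT DELETE_SLOT RETURN CLONE BREAKPOINT REMOVED
""".split()

# The first name belongs to opcode 0x01, the last one to 0x36.
OPCODE_NAMES = {value: f"OP_{name}" for value, name in enumerate(_NAMES, start=1)}


def dink_opcode_name(opcode: int) -> str:
    """Look up an opcode name, falling back to a marker for unknown values."""
    return OPCODE_NAMES.get(opcode, f"OP_UNKNOWN_0x{opcode:02X}")
```

Opcode 0x00 is not in the list, so it falls through to the unknown marker even though it is treated as a no-op according to the earlier comment.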

@james-aslett

james-aslett commented Oct 2, 2022

Would be epic if you could confirm once we are successfully able to get hold of the music. I managed to get as far as determining that ggpack4d contains the music, and extracted Music.assets.bank from it, but none of the bank extraction applications seem to work - 0 files found :(

@bgbennyboy
Owner

Working on it now, I found a better way of doing it, so re-writing it.


6 participants