Conversion of OOXML to CiceroMark and vice versa #251

algomaster99 · 2020-07-09T16:26:49Z

Issue #247

Add package for converting OOXML <-> CiceroMark

Changes

initialize package
add function for converting OOXML to CiceroMark with proper tests
add function for converting CiceroMark to OOXML with proper tests

Signed-off-by: Aman Sharma <[email protected]>

jeromesimeon

Why is this a different package from docx?

algomaster99 · 2020-07-09T16:38:17Z

Oh, I just thought it was supposed to be a different package. One reason I can think of is that markdown-docx takes a path to a DOCX file, which I think pulls in dependencies that might work only on Node and not in the browser. Mammoth works on the web too, though.

jeromesimeon · 2020-07-09T16:41:29Z

> Oh, I just thought it was supposed to be a different package. One reason I can think of is that markdown-docx takes a path to a DOCX file, which I think pulls in dependencies that might work only on Node and not in the browser. Mammoth works on the web too, though.

I think it would make sense for all transforms to work on input data in memory and not depend on the file system? Maybe I'm missing something there.

Also, it would help to figure out whether there are any differences between OOXML and docx before creating a whole new package.

@DianaLease @irmerk @dselman help here please?
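A minimal sketch of the "work on data in memory" idea raised above, not the actual markdown-docx API. The function names `transformFromString` and `transformFromFile` are hypothetical, and the regex-based text extraction is only a stand-in for a real transform. The path passed to the wrapper is assumed to point at an already extracted XML part, not at a zipped .docx file.

```javascript
// Hypothetical names throughout; this is not the markdown-docx API.

// Core transform: it only sees a string that is already in memory,
// so nothing here depends on Node or on the file system.
// Stand-in logic: collect the text of all <w:t> runs.
function transformFromString(documentXml) {
    const texts = [];
    // Matches <w:t> and <w:t xml:space="preserve">, but not <w:tab/> or <w:tbl>.
    const pattern = /<w:t(?:\s[^>]*)?>([^<]*)<\/w:t>/g;
    let match;
    while ((match = pattern.exec(documentXml)) !== null) {
        texts.push(match[1]);
    }
    return texts.join('');
}

// Optional Node-only convenience wrapper: the file system stays at the edge.
// Assumes `path` points at an extracted XML part such as document.xml.
function transformFromFile(path) {
    const fs = require('fs');
    return transformFromString(fs.readFileSync(path, 'utf8'));
}

const sample =
    '<w:p><w:r><w:t>Hello </w:t></w:r>' +
    '<w:r><w:t xml:space="preserve">world</w:t></w:r></w:p>';
const text = transformFromString(sample);
```

With this split, a browser caller can pass a string obtained from an upload or a fetch, while a Node caller can still start from a path.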

dselman · 2020-07-09T16:45:17Z

> Why is this a different package from docx?

I think calling this OOXML is more descriptive, as technically that is the (open) data format, whereas docx is the Microsoft proprietary file wrapper.

algomaster99 · 2020-07-09T16:46:38Z

@dselman But we will only be able to transform OOXML from Word documents to CiceroMark. Other OOXML (e.g. from PowerPoint or Excel) will fail, as transforming it makes no sense.
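One way to make that restriction explicit is to check the root element before transforming. This is a sketch under assumptions: the helper name `isWordDocument` is hypothetical, the parsed tree is assumed to use the same `name` / `elements` node shape as the `getId()` code reviewed in this PR, and the conventional `w:` namespace prefix is assumed for the WordprocessingML root element `w:document`.

```javascript
// Hypothetical guard, not part of the PR.
// Assumes nodes shaped like { name, attributes, elements } and the
// conventional `w:` prefix for WordprocessingML.
function isWordDocument(parsed) {
    if (!parsed || !Array.isArray(parsed.elements)) {
        return false;
    }
    // The root is the first child that is a named element.
    const root = parsed.elements.find((node) => typeof node.name === 'string');
    return root !== undefined && root.name === 'w:document';
}

const wordTree = { elements: [{ name: 'w:document', elements: [] }] };
const slidesTree = { elements: [{ name: 'p:presentation', elements: [] }] };
```

A caller could reject input early with a clear error when `isWordDocument()` returns false, instead of failing somewhere inside the transform.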

jeromesimeon · 2020-07-09T16:47:58Z

> > Why is this a different package from docx?
>
> I think calling this OOXML is more descriptive, as technically that is the (open) data format, whereas docx is the Microsoft proprietary file wrapper.

Sounds like an argument for renaming the current package, not creating a new one?

Also, maybe I'm wrong, but doesn't a docx contain several XML files (some of which may or may not be OOXML)?

algomaster99 · 2020-07-09T16:51:44Z

> Also, maybe I'm wrong, but doesn't a docx contain several XML files?

@jeromesimeon An OOXML file is just a zipped collection of all the XML files that the Word document contains, as written here.

jolanglinais · 2020-07-09T17:21:33Z

After discussing on a call, I think it would likely make the most sense to keep the code in markdown-docx instead of making a new package. In the future we can change the name if that makes sense... But based on @algomaster99's findings it seems like docx is the most descriptive at the moment.

This will get rid of mammoth and replace the current functionality with the implementation Aman creates.

Signed-off-by: Aman Sharma <[email protected]>

jolanglinais · 2020-07-13T19:08:43Z

@algomaster99 I think we wanted to keep it markdown-docx instead of markdown-ooxml

algomaster99 · 2020-07-13T19:20:45Z

@irmerk Cool, I will try to integrate it there. Once I have done that, I will close this PR.

jolanglinais

One thought I had going through the code:

jolanglinais · 2020-07-13T19:14:18Z

packages/markdown-ooxml/src/OoxmlTransformer.js

+    getId(elements) {
+        const variableProperties = elements[0]
+        for (const property of variableProperties.elements) {
+            if (property.name === 'w:tag') {
+                return property.attributes['w:val'];
+            }
+        }
+    }


Is there a case where getId() will not return anything, and you'd want to have a base case after the for loop?

algomaster99 · 2020-07-13

@irmerk I think every variable will have a corresponding ID, so it ought to return something. Also, I only call this function when parsing w:sdt elements, which are the node types for content controls.
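For reference, a sketch of what the base case asked about above could look like. It is a hypothetical standalone variant of `getId()`, not the code merged in any package, and the sample node shape (`name`, `attributes`, `elements`) is assumed from the diff under review.

```javascript
// Hypothetical standalone variant of getId(); node shape assumed from the diff.
function getId(elements) {
    const variableProperties = elements && elements[0];
    if (!variableProperties || !Array.isArray(variableProperties.elements)) {
        // Nothing to search: no properties node or no children.
        return undefined;
    }
    for (const property of variableProperties.elements) {
        if (property.name === 'w:tag' && property.attributes) {
            return property.attributes['w:val'];
        }
    }
    // Base case: the content control has no w:tag child.
    return undefined;
}

// Sample input modelled on a content control's properties (w:sdtPr).
const sdt = [{
    name: 'w:sdtPr',
    elements: [
        { name: 'w:alias', attributes: { 'w:val': 'Shipper' } },
        { name: 'w:tag', attributes: { 'w:val': 'shipper' } },
    ],
}];
const id = getId(sdt);
```

Returning `undefined` explicitly keeps the behavior of the original loop but makes the "no tag found" case visible, so the caller can decide whether to skip the node or throw.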

feat(markdown-ooxml): initialize the markdown-ooxml package

328abad

Signed-off-by: Aman Sharma <[email protected]>

algomaster99 force-pushed the markdown-ooxml branch from 58f46d9 to 328abad Compare July 9, 2020 16:27

jeromesimeon reviewed Jul 9, 2020

Add parser for OOXML -> CiceroMark transformation

d0e8ea0

Signed-off-by: Aman Sharma <[email protected]>

jolanglinais assigned algomaster99 Jul 13, 2020

jolanglinais requested review from jolanglinais, DianaLease and jeromesimeon July 13, 2020 18:12

jolanglinais reviewed Jul 13, 2020

algomaster99 mentioned this pull request Jul 14, 2020

Add parser for OOXML transformation to CiceroMark #261

Closed

2 tasks

algomaster99 closed this Jul 14, 2020
