Add final thoughts for Tagalog on draft mode

ljvmiranda921 · Jan 17, 2023 · 5634a55
1 parent b57c0d9

Showing 2 changed files with 38 additions and 39 deletions.
diff --git a/_drafts/tagalog.md b/_drafts/tagalog.md
@@ -0,0 +1,38 @@
+<!--
+In Tagalog, we have this word called diskarte. There is no direct translation in
+English, but I can describe it loosely as resourcefulness and creativity. It's
+not a highly cognitive trait: smart people may be bookish, but not madiskarte.
+It's more practical, a form of street smarts, even. Diskarte is a
+highly Filipino trait, born from our need to solve things creatively in the
+presence of constraints. I mention this because working in Tagalog, or any
+low-resource language, requires a little diskarte, and I enjoy it! 
+
+There are many exciting ways to tackle Tagalog NLP. Right now, I'm taking the
+standard labeling, training, and evaluation approach. However, I'm interested in
+exploring model-based techniques like cross-lingual transfer learning and
+multilingual NLP to "get around" the data bottleneck. After three months (twelve
+weekends, to be specific) of labeling, I realized how long and costly the
+process was. I still believe in getting gold-standard annotations, but I also
+want to balance this approach with short-term solutions. 
+
+I wish we had more consolidated efforts to work on Tagalog NLP. Right now,
+research progress at each institution seems disconnected from the
+others. I definitely like what's happening in
+[Masakhane](https://www.masakhane.io/) for African languages and
+[IndoNLP](https://indonlp.github.io/) for Indonesian. I think they are good
+community models to follow. In the future, wouldn't it be great if [Komisyon sa
+Wikang Filipino](https://kwf.gov.ph/) had a dedicated computational linguistics
+group? Tagalog is not the only language in the Philippines, and being able to
+solve one Filipino language at a time would be nice.
+
+Right now, I'm working on
+[calamanCy](https://github.com/ljvmiranda921/calamanCy), my attempt to create
+spaCy pipelines for Tagalog. Its name is based on kalamansi, a citrus fruit
+common in the Philippines. Unfortunately, it's something that I've been working
+on in my spare time, so progress is slower than usual! This blog post contains
+my experiments on building the NER part of the pipeline. I plan to add a
+dependency parser and POS tagger from Universal Dependencies in the future.
+
+That's all for now. Feel free to hit me up if you have any questions or want to
+collaborate! Maraming salamat!
+-->
diff --git a/notebook/_posts/2023-02-04-tagalog-pipeline.md b/notebook/_posts/2023-02-04-tagalog-pipeline.md
@@ -675,45 +675,6 @@ annotating.
 
 TODO
 
-<!--
-In Tagalog, we have this word called diskarte. There is no direct translation in
-English, but I can describe it loosely as resourcefulness and creativity. It's
-not a highly cognitive trait: smart people may be bookish, but not madiskarte.
-It's more practical, a form of street smarts, even. Diskarte is a
-highly Filipino trait, born from our need to solve things creatively in the
-presence of constraints. I mention this because working in Tagalog, or any
-low-resource language, requires a little diskarte, and I enjoy it! 
-
-There are many exciting ways to tackle Tagalog NLP. Right now, I'm taking the
-standard labeling, training, and evaluation approach. However, I'm interested in
-exploring model-based techniques like cross-lingual transfer learning and
-multilingual NLP to "get around" the data bottleneck. After three months (twelve
-weekends, to be specific) of labeling, I realized how long and costly the
-process was. I still believe in getting gold-standard annotations, but I also
-want to balance this approach with short-term solutions. 
-
-I wish we had more consolidated efforts to work on Tagalog NLP. Right now,
-research progress at each institution seems disconnected from the
-others. I definitely like what's happening in
-[Masakhane](https://www.masakhane.io/) for African languages and
-[IndoNLP](https://indonlp.github.io/) for Indonesian. I think they are good
-community models to follow. In the future, wouldn't it be great if [Komisyon sa
-Wikang Filipino](https://kwf.gov.ph/) had a dedicated computational linguistics
-group? Tagalog is not the only language in the Philippines, and being able to
-solve one Filipino language at a time would be nice.
-
-Right now, I'm working on
-[calamanCy](https://github.com/ljvmiranda921/calamanCy), my attempt to create
-spaCy pipelines for Tagalog. Its name is based on kalamansi, a citrus fruit
-common in the Philippines. Unfortunately, it's something that I've been working
-on in my spare time, so progress is slower than usual! This blog post contains
-my experiments on building the NER part of the pipeline. I plan to add a
-dependency parser and POS tagger from Universal Dependencies in the future.
-
-That's all for now. Feel free to hit me up if you have any questions or want to
-collaborate! Maraming salamat!
--->
-
 
 
 ## References