Files and Folders refactor
IFFranciscoME committed Oct 2, 2024
1 parent f9318eb commit a8cff3b
Showing 22 changed files with 85 additions and 42 deletions.
8 changes: 5 additions & 3 deletions .gitignore
@@ -4,17 +4,19 @@

knowledge/
*.pdf
*.gguf

# -- Rust ------------------------------------------------------------------- #
# -- ---- ------------------------------------------------------------------- #

/debug
/target
Cargo.lock
/rust/target
/rust/Cargo.lock
*.gguf

# -- Python ----------------------------------------------------------------- #
# -- ------ ----------------------------------------------------------------- #

.env
.env/
*.cpython-311-darwin.so

19 changes: 16 additions & 3 deletions README.md
@@ -1,12 +1,19 @@
# molina

Welcome to the `molina` project, a synthetic research asistant for local knowledge representation.
Welcome to the `molina` project, a synthetic research agent for local knowledge representation.

Notice here the absence of the terms `Artificial`, `Intelliget` and similar, this is not an `artificially sweetened` project, is not a claim to be the panacea, neither the $n-th$ attempt to solve general let alone narrow intelligence, because this is definitely not intelligent. It is a tool, built with an agentic approach, for you interact with by the act of formulating research-related questions, which will get you responses using ONLY the academic papers you provide as the knowledge base.
Notice the absence of the terms `Artificial` and `Intelligence`; this is deliberate.
This is not an `artificially sweetened` project, nor a claim to be the panacea, nor
the $n$-th attempt to solve general, let alone narrow, intelligence, because this is definitely
not a claim of something being intelligent.

What is this, then? A tool, built with an agentic approach, for you to interact with by
formulating research-related questions, which will get you responses using *only* the
academic papers you provide as the *knowledge base*.

The name is in honor of THE greatest Mexican researcher of all time, [Mario Molina (1943-2020)](https://es.wikipedia.org/wiki/Mario_Molina_(químico)).

## Problem
## Problems

Challenges of the use of Large Language Models (LLMs), as an academic research assistant, are:

@@ -16,3 +23,9 @@ Challenges of the use of Large Language Models (LLMs), as an academic research a
- Bias in Training Data.
- Interpretability and Response Attribution.

## Project Structure

- Python
- Rust
- Models

Empty file added models/README.md
Empty file.
1 change: 1 addition & 0 deletions python/molina/README.md
@@ -0,0 +1 @@

2 changes: 0 additions & 2 deletions python/molina/main.py
@@ -1,8 +1,6 @@
from molina import read_pdf

def main():
    # result = add(5, 7)
    # print(f"The result of addition is now: {result}")
    read_pdf("knowledge/case_1/Advances-Prospect-Kahneman-Tversky-1992.pdf")

if __name__ == "__main__":
Binary file removed python/molina/molina.cpython-311-darwin.so
Binary file not shown.
File renamed without changes.
1 change: 1 addition & 0 deletions rust/src/content/README.md
@@ -0,0 +1 @@
## Content Module
File renamed without changes.
33 changes: 20 additions & 13 deletions src/content/parse.rs → rust/src/content/parse.rs
@@ -1,10 +1,10 @@
use lopdf::{Document, Object};
use std::collections::BTreeMap;
use serde::{Deserialize, Serialize};
use std::io::{Error, ErrorKind, Write};
use std::fs::File;
use std::fmt::Debug;
use serde_json;
use std::collections::BTreeMap;
use std::fmt::Debug;
use std::fs::File;
use std::io::{Error, ErrorKind, Write};
use std::path::{Path, PathBuf};
use std::time::Instant;

@@ -60,7 +60,8 @@ fn filter_func(object_id: (u32, u16), object: &mut Object) -> Option<((u32, u16)
}

fn load_pdf<P: AsRef<Path>>(path: P) -> Result<Document, Error> {
Document::load_filtered(path, filter_func).map_err(|e| Error::new(ErrorKind::Other, e.to_string()))
Document::load_filtered(path, filter_func)
.map_err(|e| Error::new(ErrorKind::Other, e.to_string()))
}

fn get_pdf_text(doc: &Document) -> Result<PdfText, Error> {
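
Note on the `load_pdf` hunk above: the reformatted `map_err` call folds a library error into `std::io::Error` by stringifying it. A minimal sketch of that pattern using only the standard library follows; `ParseFailure`, `load_raw`, and `load` are hypothetical stand-ins for illustration and are not part of `lopdf` or of this commit.

```rust
use std::fmt;
use std::io::{Error, ErrorKind};

// Hypothetical stand-in for a third-party error type (assumption, not lopdf's).
#[derive(Debug)]
struct ParseFailure(String);

impl fmt::Display for ParseFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse failure: {}", self.0)
    }
}

// Hypothetical loader that always fails, so the error path is exercised.
fn load_raw(path: &str) -> Result<(), ParseFailure> {
    Err(ParseFailure(format!("cannot read {path}")))
}

// Same shape as `load_pdf`: convert the foreign error into `std::io::Error`
// by keeping only its display text.
fn load(path: &str) -> Result<(), Error> {
    load_raw(path).map_err(|e| Error::new(ErrorKind::Other, e.to_string()))
}
```

The trade-off in this approach is that the original error value is reduced to its message, so callers can only match on `ErrorKind::Other` and the text.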
@@ -102,7 +103,12 @@ fn get_pdf_text(doc: &Document) -> Result<PdfText, Error> {
Ok(pdf_text)
}

fn pdf2text<P: AsRef<Path> + Debug>(path: P, output: P, pretty: bool, password: &str) -> Result<(), Error> {
fn pdf2text<P: AsRef<Path> + Debug>(
path: P,
output: P,
pretty: bool,
password: &str,
) -> Result<(), Error> {
println!("Load {path:?}");
let mut doc = load_pdf(&path)?;
if doc.is_encrypted() {
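
Note on the `pdf2text` signature in the hunk above: because `path` and `output` share the single type parameter `P`, both arguments must have the same concrete type (in `pdf_generate` both are `&PathBuf`). A possible alternative, sketched here under the assumption that mixed argument types are wanted, uses two type parameters. `describe` and `demo` are hypothetical names used only for illustration.

```rust
use std::fmt::Debug;
use std::path::{Path, PathBuf};

// Hypothetical variant of the signature: one type parameter per path argument,
// so a `&str` input and a `&PathBuf` output can be passed together.
fn describe<I, O>(input: I, output: O) -> String
where
    I: AsRef<Path> + Debug,
    O: AsRef<Path> + Debug,
{
    format!(
        "{} -> {}",
        input.as_ref().display(),
        output.as_ref().display()
    )
}

// Calls the function with two different argument types.
fn demo() -> String {
    let output = PathBuf::from("knowledge/case_1/file_texted.text");
    describe(
        "knowledge/case_1/Advances-Prospect-Kahneman-Tversky-1992.pdf",
        &output,
    )
}
```

With a single `P`, the call in `demo` would not compile, since `&str` and `&PathBuf` are different types.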
@@ -126,19 +132,20 @@ fn pdf2text<P: AsRef<Path> + Debug>(path: P, output: P, pretty: bool, password:
Ok(())
}


pub fn pdf_generate() -> Result<(), Error> {

let start_time = Instant::now();
let pdf_path = PathBuf::from("knowledge/case_1/Advances-Prospect-Kahneman-Tversky-1992.pdf".to_string());
let pdf_path =
PathBuf::from("knowledge/case_1/Advances-Prospect-Kahneman-Tversky-1992.pdf".to_string());
let mut output = PathBuf::from("knowledge/case_1/file_texted.pdf");
output.set_extension("text");

let pretty = true;
let passw = "abc123";
pdf2text(&pdf_path, &output, pretty, passw)?;
println!("Executed after {:.1} seconds.", Instant::now().duration_since(start_time).as_secs_f64());

println!(
"Executed after {:.1} seconds.",
Instant::now().duration_since(start_time).as_secs_f64()
);

Ok(())
}
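
Note on `pdf_generate` above: the output path is built by swapping the extension with `set_extension`, and the run time is measured with `Instant::now().duration_since(start_time)`. The sketch below isolates both steps with the standard library only. `text_output_for` and `timed` are hypothetical helpers, and `Instant::elapsed` is used as a shorter equivalent of the `duration_since` call.

```rust
use std::path::PathBuf;
use std::time::{Duration, Instant};

// Derive the output path by replacing the extension, as `pdf_generate` does
// for "file_texted.pdf".
fn text_output_for(input: &str) -> PathBuf {
    let mut output = PathBuf::from(input);
    output.set_extension("text");
    output
}

// Run a closure and report how long it took.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = work();
    (value, start.elapsed())
}
```

For example, `timed(|| text_output_for("knowledge/case_1/file_texted.pdf"))` yields the path `knowledge/case_1/file_texted.text` together with the elapsed `Duration`. Also note that `PathBuf::from` accepts a `&str` directly, so the `.to_string()` on the input path in the diff is not required.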

1 change: 1 addition & 0 deletions rust/src/data/README.md
@@ -0,0 +1 @@
## Data Module
2 changes: 2 additions & 0 deletions rust/src/data/db.rs
@@ -0,0 +1,2 @@
// Placeholder

1 change: 1 addition & 0 deletions rust/src/data/mod.rs
@@ -0,0 +1 @@
// Placeholder
1 change: 1 addition & 0 deletions rust/src/inference/README.md
@@ -0,0 +1 @@
## Inference Module
1 change: 0 additions & 1 deletion src/inference/llama.rs → rust/src/inference/llama.rs
@@ -1,4 +1,3 @@

use llama_cpp_rs::{
options::{ModelOptions, PredictOptions},
LLama,
File renamed without changes.
32 changes: 32 additions & 0 deletions rust/src/lib.rs
@@ -0,0 +1,32 @@
//! Molina, in honor of THE greatest Mexican researcher of all time, Mario Molina (1943-2020).
//!
//! A Rust and Python Synthetic Integration for an agentic-LLM approach to build a research agent
//! for local knowledge representation.
use crate::content::parse;
use pyo3::prelude::*;

/// Tools for content parsing and generation
pub mod content;

/// Tools for data accessing, I/O, compression.
pub mod data;

/// Tools for making an inference computation
pub mod inference;

/// Events, Custom Error Types, Logs
pub mod messages;

#[pyfunction]
fn read_pdf(_file_path: &str) -> PyResult<()> {
let read_result = parse::pdf_generate();
println!("temporal result is this {:?}", read_result);
Ok(())
}

#[pymodule]
fn molina(_py: Python, m: &PyModule) -> PyResult<()> {
m.add_function(wrap_pyfunction!(read_pdf, m)?)?;
Ok(())
}
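
Note on `read_pdf` above: the function accepts `_file_path` but does not use it, since `parse::pdf_generate()` works on a hard-coded path. One possible direction, sketched here with the standard library only (no `pyo3` or `lopdf`), is to validate the caller's path and derive the output path from it. `plan_conversion` is a hypothetical helper and not part of this commit.

```rust
use std::io::{Error, ErrorKind};
use std::path::{Path, PathBuf};

// Hypothetical path-aware helper: checks the extension and returns the
// (input, output) pair that a conversion step could then consume.
fn plan_conversion(file_path: &str) -> Result<(PathBuf, PathBuf), Error> {
    let input = Path::new(file_path);
    let is_pdf = input
        .extension()
        .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"));
    if !is_pdf {
        return Err(Error::new(
            ErrorKind::InvalidInput,
            format!("not a PDF path: {file_path}"),
        ));
    }
    // Same directory and file stem, with the ".text" extension used in `pdf_generate`.
    Ok((input.to_path_buf(), input.with_extension("text")))
}
```

Returning a `Result` here also leaves room for the Python-facing wrapper to surface failures to the caller rather than printing them and returning `Ok(())`.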
1 change: 1 addition & 0 deletions rust/src/messages/README.md
@@ -0,0 +1 @@
## Messages Module
2 changes: 2 additions & 0 deletions rust/src/messages/errors.rs
@@ -0,0 +1,2 @@

// Placeholder
1 change: 1 addition & 0 deletions rust/src/messages/logs.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// Placeholder
1 change: 1 addition & 0 deletions rust/src/messages/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
// Placeholder
20 changes: 0 additions & 20 deletions src/lib.rs

This file was deleted.
