[redcap] new module #9474

regisoc · 2024-11-16T18:33:48Z

Brief summary of changes (WIP)

This PR adds the REDCap interoperability module.
This module opens an API endpoint to receive notifications from REDCap servers through the use of the Data Entry Trigger (DET). REDCap notifications are parsed and REDCap records are imported into LORIS.
If an error occurs while handling a notification, an issue is created in the issue tracker.
The module is ported from HBCD. The core logic is the same, but REDCap objects are now their own models.
This version also supports connection to multiple REDCap instances.

Features

The main goal was to replicate the HBCD data import process based on the REDCap DET.

Refactoring of the REDCap HTTP Client and linked classes/models.
Support for multiple REDCap instances.
Additional tools and features:
- redcap2linst: script to import REDCap forms as LORIS LINST instruments.
- fetch button: UI button on instrument panel to trigger data import (could be done later)

State

Testing instructions (if applicable)

Notification process:

Define the main assignee, REDCap instances and projects in config.xml, following the structure given in the RB/config.xml file.
Define which instruments can be imported by adding them to the configuration list in LORIS. See Front-end > Admin > Configuration > REDCap > redcap_importable_instrument.
Send a URL notification to LORIS (i.e. to the endpoint /redcap/notifications) to simulate the reception of a DET notification. The payload should be the following (see redcap/notifications/redcapnotification.class.inc):

{
    "instrument":"instrument_backend_name",
    "project_id":"REDCap project ID",
    "project_url":"REDCap project URL",
    "record":"REDCap record ID / PSCID",
    "redcap_event_name":"REDCap event name",
    "redcap_url":"REDCap instance URL",
    "username":"REDCap username",
    "${instrument_backend_name}_complete":"2"
}
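For manual testing, the fields above can be assembled with a small helper. This is only a sketch: `buildDetPayload` is a hypothetical function, not part of the module, and it assumes the endpoint reads form-encoded POST fields, which is how REDCap sends a DET. The value `2` for the `_complete` field is REDCap's "Complete" form status.

```php
<?php

/**
 * Build the fields of a simulated DET notification.
 * Hypothetical helper for manual testing, not part of the module.
 *
 * @return array<string, string>
 */
function buildDetPayload(
    string $instrument_name,
    string $project_id,
    string $project_url,
    string $record_id,
    string $event_name,
    string $redcap_url,
    string $username
): array {
    return [
        'instrument'        => $instrument_name,
        'project_id'        => $project_id,
        'project_url'       => $project_url,
        'record'            => $record_id,
        'redcap_event_name' => $event_name,
        'redcap_url'        => $redcap_url,
        'username'          => $username,
        // "<instrument>_complete" = 2 marks the form as complete.
        "{$instrument_name}_complete" => '2',
    ];
}

/**
 * Encode the payload as a form-encoded request body
 * (assumption: this matches what the endpoint expects).
 *
 * @param array<string, string> $payload
 */
function encodeDetPayload(array $payload): string
{
    return http_build_query($payload);
}
```

The encoded string can then be sent as the body of a POST request to /redcap/notifications with any HTTP client.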

REDCap LINST instrument process:

See tools/redcap2linst.php usage.

Link(s) to related issue(s)

RedCAP Interoperability in Core #9473

Links to related PRs

SQL/0000-00-01-Modules.sql

maximemulder

Finally finished reviewing the code, will test it now.

Okay, I finished reviewing the code, and I am now testing it with MPN. Congratulations! The module seems to work quite well and provides really convenient configuration options like multi-instance and multi-project integration. In addition, you use statically typed models instead of raw dictionaries, which I really like.

I wrote a bunch of small code comments, which you can also ignore if you do not agree with them, especially those marked with SUBJECTIVE.

On a broader level, I would have the following architectural comments:

I think we should use readonly properties for the configuration and model classes, which would help to remove a lot of boilerplate code.
I would prefer to avoid abbreviated variable names like pid, iname and rc, and instead use longer, more descriptive, names like project_id, instance_name and redcap_client. I know this does not play well with the 85-character line limit, but well, the problem is the limit here, not the longer names. I would also prefer more precise names like instrument_name, event_name and record_id instead of instrument, event and record when that makes sense.
In variable and type names, REDCap is sometimes written Redcap and sometimes REDCap, I think we should stick with one (IMO the simpler Redcap for code, and the "official" REDCap for documentation).
The configuration parser can be simplified. The current code builds the global configuration first, and then mutates it to add the instance and project configurations. It would be simpler to descend the XML tree, gather the instance configurations, and only then build the global configuration (same reasoning applies for instance and project configurations).
I need to have an additional REDCap project configuration parameter for MPN that changes the behavior of the pipeline. However, the configuration is currently not accessible from the REDCap notification handler. I think maybe there should be one configuration object accessible from there that contains all the configuration information for the specific project and instance of this call (instead of having a tree).
There are several singletons and factories, but I am not sure if these are really needed. If they do not persist across requests they should be removed IMO.
Same thing with the REDCap HTTP client cache. Is there a specific case that needs it? If not, I think the abstraction should either be factorized so that it does not need to be manually read/written in each HTTP request call, or removed.
Same thing with the REDCap HTTP client handler. I am not sure managing different clients is necessary. Probably what should happen is: 1. Get the REDCap instance and project from the DET notification, 2. Get the REDCap configuration for that instance and project from the configuration file, 3. instantiate a REDCap HTTP client from that configuration and pass it as an argument to the REDCap notification handler, without necessarily bookkeeping instantiated clients like now (unless there is a good reason to do so).
The RedcapNotification class should be part of the REDCap HTTP client models, as it is structured JSON data obtained by HTTP from REDCap.
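To make the readonly and client-construction suggestions above concrete, here is a minimal sketch. All class and function names (`ProjectConfigSketch`, `HttpClientSketch`, `findProjectConfig`, `buildClient`) and the `redcap_api_token` field are hypothetical, not the module's actual API. It assumes PHP 8.1 or later for `readonly` properties.

```php
<?php

/** Hypothetical flat configuration for one REDCap instance/project pair. */
final class ProjectConfigSketch
{
    public function __construct(
        public readonly string $redcap_instance_url,
        public readonly string $redcap_project_id,
        public readonly string $redcap_api_token,
    ) {
    }
}

/** Hypothetical stand-in for the REDCap HTTP client. */
final class HttpClientSketch
{
    public function __construct(public readonly ProjectConfigSketch $config)
    {
    }
}

/**
 * Steps 1 and 2: look up the configuration matching the instance URL
 * and project ID carried by the DET notification.
 *
 * @param ProjectConfigSketch[] $configs
 */
function findProjectConfig(
    array $configs,
    string $instance_url,
    string $project_id
): ?ProjectConfigSketch {
    foreach ($configs as $config) {
        $same_instance = rtrim($config->redcap_instance_url, '/')
            === rtrim($instance_url, '/');
        if ($same_instance && $config->redcap_project_id === $project_id) {
            return $config;
        }
    }
    return null;
}

/**
 * Step 3: build a client for this call only, with no bookkeeping
 * of previously instantiated clients.
 *
 * @param ProjectConfigSketch[] $configs
 * @param array<string, string> $notification Parsed DET fields.
 */
function buildClient(array $configs, array $notification): HttpClientSketch
{
    $config = findProjectConfig(
        $configs,
        $notification['redcap_url'],
        $notification['project_id']
    );
    if ($config === null) {
        throw new \RuntimeException(
            'No REDCap configuration for this instance and project.'
        );
    }
    return new HttpClientSketch($config);
}
```

With this shape, the notification handler would receive one object holding everything it needs for the current instance and project, and any attempt to mutate a configuration after parsing fails at runtime.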

modules/redcap/php/configurations/redcapconfiguration.class.inc

modules/redcap/php/configurations/redcapconfigurationparser.class.inc

modules/redcap/php/models/redcapprop.class.inc

modules/redcap/php/module.class.inc

maximemulder · 2024-12-19T10:12:25Z

UPDATE: Since Regis is taking (well-deserved) time off, I am working on this PR myself and adding the configuration options required for MPN (which are hopefully also useful for other projects!).

A few things I noticed during my tests:

In the REDCap get project API call, the creation time and production time fields are nullable, I changed the types in the model to reflect that.
In the REDCap get event API call, MPN also returns the day_offset, offset_min and offset_max fields. MPN uses REDCap v14.0.28. Maybe these fields are optional or depend on the REDCap version.
In the REDCap get event API call, the current code expects the REDCap "event name" and "unique event name" to be related. However, that assumption does not always hold when special characters are involved (parentheses, hyphens, underscores, spaces...). I removed the check here.
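As an illustration of the three points above, an event model could treat the offset fields as optional and accept any pair of event names. This is a sketch under assumptions: `EventSketch` and `fromArray` are hypothetical names, only the fields mentioned in this comment are modelled, and whether the offsets are really optional in REDCap is not confirmed.

```php
<?php

/** Hypothetical model for one entry of the REDCap "get event" response. */
final class EventSketch
{
    public function __construct(
        public readonly string $event_name,
        public readonly string $unique_event_name,
        public readonly ?int $day_offset = null,
        public readonly ?int $offset_min = null,
        public readonly ?int $offset_max = null,
    ) {
    }

    /**
     * Build an event from a decoded JSON object.
     * The two names are required but not compared with each other.
     *
     * @param array<string, mixed> $props
     */
    public static function fromArray(array $props): self
    {
        foreach (['event_name', 'unique_event_name'] as $key) {
            if (!isset($props[$key]) || !is_string($props[$key])) {
                throw new \UnexpectedValueException(
                    "Missing REDCap event field: $key"
                );
            }
        }
        return new self(
            $props['event_name'],
            $props['unique_event_name'],
            self::optionalInt($props, 'day_offset'),
            self::optionalInt($props, 'offset_min'),
            self::optionalInt($props, 'offset_max'),
        );
    }

    /**
     * Read an integer field that may be absent, null or empty.
     *
     * @param array<string, mixed> $props
     */
    private static function optionalInt(array $props, string $key): ?int
    {
        $value = $props[$key] ?? null;
        if ($value === null || $value === '') {
            return null;
        }
        if (!is_numeric($value)) {
            throw new \UnexpectedValueException(
                "Non-numeric REDCap event field: $key"
            );
        }
        return (int) $value;
    }
}
```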

maximemulder · 2024-12-24T09:37:20Z

Okay, I made some changes to the module for it to be usable for MPN, and made a few refactors that make the code (IMO) cleaner. There are still a few TODOs, notably regarding error handling, but I think that's nice progress. List of significant changes:

Added new configuration options and documentation for the configuration.
Reworked the REDCap configuration parsing.
Reworked the REDCap HTTP client construction (no more HTTP REDCap client handler).
Moved all the REDCap client-related classes and models into redcap/php/client (to be renamed redcap/php/redcap_client ?).
Use readonly instead of getters in the models to reduce boilerplate.
Added a RedcapProps class to validate REDCap response objects (replacement for RedcapProp).

With all these changes, I removed almost all of my previous code comments except for a few I am still interested in.

Notable TODOs:

Finish error handling.
Migrate RedcapProject to RedcapProps and readonly.
Is the cache useful in the REDCap HTTP client? It should be either removed or factorized IMO.
IMO the REDCap HTTP client does too many things. Code that is not directly related to calling the REDCap API and formatting results should probably be extracted.

CI seems broken because of an APT dependency, this is unrelated to the changes made here.

regisoc · 2025-01-02T15:16:04Z

@maximemulder happy new year, and thanks for adding more to this PR while I was away!
I did not look into the code yet, just based on your comments:

In the REDCap get project API call, the creation time and production time fields are nullable, I changed the types in the model to reflect that. -> types, good.
In the REDCap get event API call, MPN also returns the day_offset, offset_min and offset_max fields. MPN uses REDCap v14.0.28. Maybe these fields are optional or depend on the REDCap version. -> not sure as we do not have a clear vision on REDCap changelog. Was that added to the redcapevent class as optional args?
In the REDCap get event API call, the current code expects the REDCap "event name" and "unique event name" to be related. However, that assumption does not always hold when special characters are involved (parentheses, hyphens, underscores, spaces...). I removed the check here. -> the event_name is the redcap event name (e.g. 'aaa' or a LORIS visit/timepoint name in our case) and the unique_event_name concatenates the arm number with an underscore. From what we talked about with Rida based on HBCD way of naming, I thought we wanted them to be linked, thus the test. The only special characters should be an underscore between the base event name and the arm number. Is MPN using other chars?
Added new configuration options and a documentation for the configuration. -> there was a doc before and it is difficult to see what was there before with the force-push, thanks for updating the documentation.
Reworked the REDCap HTTP client construction (no more HTTP REDCap client handler). -> good if you have removed it, I did not like it but it was a first draft. I will look into this, because I just want to be sure we can access multiple REDCap instances.
Moved all the REDCap client -related classes and models in redcap/php/client (to be renamed redcap/php/redcap_client ?). -> I do not mind that. redcap/php/client looks good to me.
Use readonly instead of getters in the models to reduce boilerplate. -> great!
Added a RedcapProps class to validate REDCap response objects (replacement for RedcapProp). -> ok. Is the old RedcapProp still needed?

Todo:

Finish error handling. -> the issue creation in the issue tracker is still working, I wanted to add more specific LORIS-REDCap related exceptions.
Migrate RedcapProject to RedcapProps and readonly. -> not sure I follow here. Is it the only one left in the transition from RedcapProp to RedcapProps?
Is the cache useful in the REDCap HTTP client ? It should be either removed or factorized IMO. -> yes, it is needed, and yes, it will be factorized. The DET triggers a lot of notifications, and having big items cached (i.e. the data dictionary, events, arms, instruments list) per REDCap instance is a must IMO. It was not a priority, but I laid the groundwork for it.
IMO the REDCap HTTP client does too many things. Code that is not directly related to calling the REDCap API and formatting results should probably be extracted. -> not sure I get this, which methods?
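One way to factorize the cache discussed above is a small memoizing wrapper keyed by REDCap instance and item name, so that individual HTTP calls do not read and write the cache by hand. This is a sketch only: `InstanceCacheSketch` is a hypothetical class, and the TTL is an assumed parameter. The array lives in PHP memory, so it only helps within a single request; sharing entries across DET notifications would need an external store, which is not shown here.

```php
<?php

/** Hypothetical per-instance cache for large REDCap items. */
final class InstanceCacheSketch
{
    /** @var array<string, array{expires: int, value: mixed}> */
    private array $entries = [];

    public function __construct(private readonly int $ttl_seconds = 300)
    {
    }

    /**
     * Return the cached item, or call $loader and cache its result.
     *
     * @param string   $instance_url REDCap instance the item belongs to.
     * @param string   $item         Item name, e.g. 'dictionary' or 'events'.
     * @param callable $loader       Fetches the item when it is not cached.
     * @param ?int     $now          Current Unix time (injectable for tests).
     */
    public function get(
        string $instance_url,
        string $item,
        callable $loader,
        ?int $now = null
    ): mixed {
        $now ??= time();
        $key   = $instance_url . '|' . $item;
        $entry = $this->entries[$key] ?? null;
        if ($entry !== null && $entry['expires'] > $now) {
            return $entry['value'];
        }
        $value = $loader();
        $this->entries[$key] = [
            'expires' => $now + $this->ttl_seconds,
            'value'   => $value,
        ];
        return $value;
    }
}
```

A client method would then wrap its API call in a closure and pass it as `$loader`, keeping the caching policy in one place.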

CI issue -> not related to phan type checks?

I will look into that next week, and I would like to have a quick meeting too so we can align on the next steps.

maximemulder

I have made all the changes I wanted to make to the REDCap module (I am just finishing updating the tests), and it now seems to cover all my needs for MPN.

I have two comments here. While refactoring the configuration, I removed the instance name, but you can add it back if you want (IMO using the URL instead of the name is enough). It should be easy to add such options with the refactored configuration and parser.

modules/redcap/php/redcapnotificationhandler.class.inc

maximemulder · 2025-01-20T23:50:38Z

modules/redcap/php/redcapnotificationhandler.class.inc

+        if (isset($instrument_values['dtt'])) {
+            $dt = \DateTime::createFromFormat(
+                'Y-m-d H:i:s',
+                $instrument_values['dtt'],
+            );
+
+            if (!$dt) {
+                error_log(
+                    "[redcap] Could not parse 'dtt': "
+                    . $instrument_values['dtt']
+                );
+            } else {
+                $instrument_values['Date_taken'] = $dt->format('Y-m-d');
+            }
+        }
+
+        // if null/empty, try getting that based on the timestamp
+        if (empty($instrument_values['Date_taken'])) {
+            if (isset($instrument_values['timestamp'])) {
+                $dt = \DateTime::createFromFormat(
+                    'Y-m-d H:i:s',
+                    $instrument_values['timestamp']
+                );
+                if (!$dt) {
+                    error_log(
+                        "[redcap] Could not parse 'timestamp': "
+                        . $instrument_values['timestamp']
+                    );
+                } else {
+                    $instrument_values['Date_taken'] = $dt->format('Y-m-d');
+                }
+            }
+        }
+
+        // if null/empty, try getting that based on the timestamp_start
+        if (empty($instrument_values['Date_taken'])) {
+            if (isset($instrument_values['timestamp_start'])) {
+                $dt = \DateTime::createFromFormat(
+                    'Y-m-d H:i:s',
+                    $instrument_values['timestamp_start']
+                );
+                if (!$dt) {
+                    error_log(
+                        "[redcap] Could not parse 'timestamp_start': "
+                        . $instrument_values['timestamp_start']
+                    );
+                } else {
+                    $instrument_values['Date_taken'] = $dt->format('Y-m-d');
+                }
+            }
+        }
+
+        // if still null/empty, get the current date
+        if (empty($instrument_values['Date_taken'])) {
+            $dtNow = new \DateTimeImmutable();
+            $instrument_values['Date_taken'] = $dtNow->format('Y-m-d');
+        }
+
+        // add the timestamp_stop in the values based on the last timestamp
+        if (isset($instrument_values['timestamp'])
+            && !empty($instrument_values['timestamp'])
+        ) {
+            // rename var to uniformize with other LORIS instruments
+            // Duration will be calculated when _saveValues is called.
+            $instrument_values['timestamp_stop'] = $instrument_values['timestamp'];
+        }


I am not sure I understand all this code. It seems rather hacky and has been a little fragile in my experience (I had to move some things around for it to work on MPN). It would be best to refactor it IMO.

This comes from the HBCD case: most of the timestamps were not guaranteed to be present, but we still needed a date of administration (Date_taken) for each instrument.
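A possible shape for the refactor suggested here is to express the fallback order described by the comments in the quoted code as one function. This is a sketch, not the module's code: `parseRedcapTimestamp` and `resolveDateTaken` are hypothetical names, and the `Y-m-d H:i:s` format is taken from the hunk above.

```php
<?php

/** Parse a REDCap 'Y-m-d H:i:s' value, or return null if it does not parse. */
function parseRedcapTimestamp(mixed $value): ?\DateTimeImmutable
{
    if (!is_string($value) || $value === '') {
        return null;
    }
    $dt = \DateTimeImmutable::createFromFormat('Y-m-d H:i:s', $value);
    return $dt === false ? null : $dt;
}

/**
 * Resolve the administration date with the following priority:
 * 'dtt', then an existing non-empty 'Date_taken', then 'timestamp',
 * then 'timestamp_start', and finally the current date.
 *
 * @param array<string, mixed> $instrument_values
 */
function resolveDateTaken(
    array $instrument_values,
    \DateTimeImmutable $now
): string {
    // 'dtt' takes precedence whenever it parses.
    $dt = parseRedcapTimestamp($instrument_values['dtt'] ?? null);
    if ($dt !== null) {
        return $dt->format('Y-m-d');
    }

    // Keep a date that is already present.
    if (!empty($instrument_values['Date_taken'])) {
        return (string) $instrument_values['Date_taken'];
    }

    // Otherwise fall back on the first timestamp that parses.
    foreach (['timestamp', 'timestamp_start'] as $field) {
        $dt = parseRedcapTimestamp($instrument_values[$field] ?? null);
        if ($dt !== null) {
            return $dt->format('Y-m-d');
        }
    }

    // Last resort: the current date.
    return $now->format('Y-m-d');
}
```

Passing the current time as an argument keeps the function deterministic, which makes each fallback easy to unit test.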

regisoc

Quickly tested, I have a consistent 500 error that might not be related. Trying to debug that.

modules/redcap/CONFIGURATION.md

regisoc · 2025-01-21T05:33:11Z

modules/redcap/php/queries.class.inc

+ * @license  http://www.gnu.org/licenses/gpl-3.0.txt GPLv3
+ * @link     https://www.github.com/aces/Loris/
+ */
+class Queries


Used in notification handler only?

I think so, but that is not by design; it could be extended with other queries to be used elsewhere if wanted.

TBH it might be good to split the notification handler into several classes for different tasks (get LORIS instrument, validate record dictionary, update instrument...) but I did not have the time for that.

I like the Query idea but same here, lack of time.
Yes, we talked about splitting tasks and I wanted to do it, but again, lack of time.
If it is working, I guess a refactoring can be done later.

Agree, the module is currently ok in terms of code, can be improved but nothing blocking if it works IMO.

modules/redcap/php/client/models/redcapprop.class.inc

modules/redcap/php/client/models/redcapdictionaryrecord.class.inc

driusan reviewed Nov 18, 2024

SQL/0000-00-01-Modules.sql (outdated)

CamilleBeau assigned regisoc Nov 25, 2024

maximemulder reviewed Dec 12, 2024

maximemulder force-pushed the 20241020_redcap_module branch from 61d5c00 to 28f3733 Compare December 18, 2024 10:08

maximemulder force-pushed the 20241020_redcap_module branch 7 times, most recently from c72c69b to 3472533 Compare December 24, 2024 09:35

maximemulder force-pushed the 20241020_redcap_module branch from 3472533 to 5223553 Compare January 2, 2025 10:11

regisoc added 10 commits January 7, 2025 18:33

redcap - patch

fce04a2

redcap - RB struct config

76ebe60

redcap - sql

0cf49be

redcap - new module

914692c

redcap - lint client

6883932

redcap - lint endpoint notification

39929b1

redcap - lint notification handler

1df071c

redcap - lint notification

0582cd5

redcap - lint module

c56f318

redcap - lint redcap client handler

5b5cec6

maximemulder added 7 commits January 7, 2025 18:33

minor changes

3986774

remove debug line

56dbf10

fix php warnings

8fa8677

fix import path

b8d17b2

add truefalse field export

2cb5b97

fix redcap2linst

f999ace

fix redcapnotification namespace

3478e29

maximemulder force-pushed the 20241020_redcap_module branch from db54f90 to 3478e29 Compare January 7, 2025 23:33

maximemulder added 9 commits January 7, 2025 18:34

fix typo

99588be

fix redcap record redcap event name field

424946c

update documentation

4248c51

add redcap parser error messages

244e3c1

prefix instrument variable config

849bed5

handle file type

e6e3d9a

more errors

ff3a9da

use config issue assignee

ff1f0b6

update module tests

3f4f464

maximemulder reviewed Jan 20, 2025

maximemulder added 2 commits January 20, 2025 23:57

remove comment

051644c

improve phan

81b5e55

regisoc commented Jan 21, 2025

maximemulder reviewed Jan 21, 2025

modules/redcap/php/client/models/redcapdictionaryrecord.class.inc (outdated)

regisoc and others added 8 commits January 28, 2025 16:14

sql delete statement

4982008

test config ns

1cff260

rebase

0e28acd

test plan - intro/single instance

d3321f4

finish refactor redcap models

59975bf

fix lints

e2d004b

update raisinbread redcap config

97af3e5

fix typing

d2b35ae
