Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement logging of profile changes (username, password, image, ...) #1708

Open
wants to merge 2 commits into
base: v4.3
Choose a base branch
from

Conversation

kainhofer
Copy link
Contributor

@kainhofer kainhofer commented Nov 23, 2024

  • Add table [PREFIX]_users_profile_log for changes to username, password
  • Log changes to that table
  • Also implement logging creating/deleting users
  • Merge entries with profile field log in profile_field_history.php

So far, logging was only implemented for user fields (name, email, etc.)
stored in dthe _user_data table, but not for core profile fields (username,
password, photo, ...) in the _users table. This patch implements logging
changes to those profile fields into a new table _users_profile_log, which
works similarly to the _user_log table. The ChangeNotification class had
already implemented most of the required framework, only the actual storing
to the database was missing and is added with this patch. This patch also
adds logging of user creation and deletion (although deletions will be
erased from the database immediately afer logging).

The change history page (profile_field_history.php) is extended to also
read out the log entris from the new _users_profile_log table and merge
them with the user field change log. Like in the edit user screen, there
will be no difference to the user, even though the technology/tables in
the background are different.

Fixes #326
See also #1069 (implements parts 2 and 3)

* Add table [PREFIX]_users_profile_log for changes to username, password
  * Log changes to that table
  * Also implement logging creating/deleting users
  * Merge entries with profile field log in profile_field_history.php

So far, logging was only implemented for user fields (name, email, etc.)
stored in dthe _user_data table, but not for core profile fields (username,
password, photo, ...) in the _users table. This patch implements logging
changes to those profile fields into a new table _users_profile_log, which
works similarly to the _user_log table.  The ChangeNotification class had
already implemented most of the required framework, only the actual storing
to the database was missing and is added with this patch. This patch also
adds logging of user creation and deletion (although deletions will be
erased from the database immediately afer logging).

The change history page (profile_field_history.php) is extended to also
read out the log entris from the new _users_profile_log table and merge
them with the user field change log. Like in the edit user screen, there
will be no difference to the user, even though the technology/tables in
the background are different.

Fixes Admidio#326
See also Admidio#1069 (implements parts 2 and 3)
@Fasse
Copy link
Member

Fasse commented Nov 24, 2024

I like the logic to log also password, username, image and membership data.

But I think it's not useful to implement a separate logging table for each kind of field changes we want to log. I think a more generic logging table for field changes would be useful.

Something like this:

CREATE TABLE %PREFIX%_log_field_changes
(
    log_id               
    log_table       
    log_record_id 
    log_value_old               text                NULL,
    log_value_new               text                NULL,
    log_usr_id_create           integer unsigned    NULL,
    log_timestamp_create        timestamp           NOT NULL    DEFAULT CURRENT_TIMESTAMP,
    log_comment                 varchar(255)        NULL,
    PRIMARY KEY (log_id)
)

With this we could also implement one generic html output page and could then log every changes of all objects if we want.

@kainhofer
Copy link
Contributor Author

kainhofer commented Nov 28, 2024

That would be a generic structure of the logging table, but it has a few issues:

  1. It assumes that the affected record will still exist in the database. For profile fields that might be true, but for memberships this is not true in general. In particular, I'm particularly interested in (intentional or inadvertent) membership deletions. In these cases, the corresponding membership record is no longer available, so the logging table needs to contain all relevant information.
  2. Filtering is harder, as it depends on the table: the _users table can/must be filtered by its main key usr_id, the _user_data table by the non-key column "usd_usr_id", and memberships are filtered by user (mem_usr_id) AND group (mem_rol_id). In the Groups & Roles view, I need to see only membership changes pertaining to the currently selected group (i.e. filtered by mem_rol_id), while in the individual profile view, I want all membership changes to be filtered by the user (mem_usr_id).
  3. Accessing / indexing the field that is changed is dependent on the base table: user profile fields have one entry in the _user_data that specifies the field and the value. Core profile fields like username, password or photo are one column in the referenced record. The same goes for membership changes. -> We need additional, which column of the referenced table is affected (and the handling depends on the underlying log_table).
  4. How would we log record/object creation and deletion?

So, combining the issues, we'd need to store the data of the referenced record in the log record, including all columns that are needed for filtering -> currently at least two columns available for filtering, plus one or two columns indicating the field that is changed (currently, the _user_log table stores the field ID, while for core profile fields there is no such thing, as they are separate columns in the _users table)...

@Fasse
Copy link
Member

Fasse commented Nov 30, 2024

You are right a generic method will be more complex than my example db structure shows. For now we can go with your solution but I think a generic method will be very useful because it will be less work to integrate a good UI and don't double the code.

@kainhofer
Copy link
Contributor Author

kainhofer commented Dec 6, 2024

After a little more thought, the following log table structure might be generic enough to handle most changes, while still allowing rather simple reporting / filtering:

CREATE TABLE %PREFIX%_log_changes
(
    log_id            int(10) NOT NULL,
    log_table     varchar(255),  -- SQL table name without prefix (to allow copying/cloning the DB with different prefix)

    log_record_id int(10),  -- The ID of the modified record in the above-defined table
    log_record_name text,  -- a human-readable (or translatable) representation of the record, as the record might have been deleted meanwhile
    log_object_id int(10) NULL, -- Second-level filter (e.g. Memberships have usr_id and role / mem_id)
    log_object_name text NULL, -- again, human-readable representation
    log_field varchar(255) NULL, -- Definition of the modified field (usf_id for user fields, column table names for profile fields or membership attributes or other more generic changes)
    log_field_name   varchar(255) NULL, -- again, human-readable representation

    log_action varchar(255), -- enum of "MODIFY", "CREATED", "DELETED"
    log_value_old               text                NULL,  -- if "MODIFY", old value
    log_value_new               text                NULL -- if "MODIFY", new value,

    log_usr_id_create           integer unsigned    NULL,
    log_timestamp_create        timestamp           NOT NULL    DEFAULT CURRENT_TIMESTAMP,
    log_comment                 varchar(255)        NULL,
    PRIMARY KEY (log_id)
)

Regarding membership modifications, I'm not sure about the best choice of selectors/filter columns, as we want to filter by usr_id and rol_id, but we also have the mem_id of the membership object... I didn't want to add three filter columns to the table...

@kainhofer
Copy link
Contributor Author

Potential entries to the log table for various types of changes (leaving out the unique key volumn and the modification user and timestamps):

Current log (changes to profile fields, table _user_data):

(
    "user_data", 
    500, "Kainhofer, Reinhold",  -- usr_id and human-readable representation
    -1, NULL, 
    18, "SYS_XING", -- usr_id: ID and name for row in user_fields table 

    "MODIFY", NULL, "https://www.xing.com/profile/TEST_USER"
)

Changes to core user profile (_users):

(
   "users",
    500, "Kainhofer, Reinhold",  -- usr_id and human-readable representation
    NULL, NULL,
    "usr_login_name", "SYS_USERNAME", -- column in the _users table and translatable identifier

   "MODIFY", "testuser", "testuser2"
)

New user created (_users) - deletion similar:

(
    "users",
    500, "Kainhofer, Reinhold",  -- usr_id and human-readable representation
    NULL, NULL,
    NULL, NULL,

    "CREATED" , NULL, NULL -- "DELETED" for user deletions, all other fields are identical (no information anyway)
)

Membership change (_members):

TODO:

  • We have a membership ID, the user ID and the role

    • which of those should be stored in the log table?
    • I chose the membership ID (might be deleted) and the role name as human-readable text, which is preserved in the log table even if the membership record is deleted!]
    • On the other hand, we want to filter both usr_id (for the changelog of the individual user) and rol_id (for the changelog of the role in the role membership / contacts page). This is currently a problem with my suggested table structure...
  • Maybe we need three filter-level items?

    (
        "members",
        500, "Kainhofer, Reinhold",  -- usr_id and human-readable representation
        2, "Mitglied", -- mem_id from _members table and rol_name from table _roles (human-readable)
        NULL, NULL,
    
        "CREATED", NULL, NULL
    )
    (
        "members",
        500, "Kainhofer, Reinhold",  -- usr_id and human-readable representation
        2, "Mitglied",             -- mem_id from _members table and rol_name from table _roles (human-readable)
        "mem_begin", "…….", -- column in _members table and translatable identifier
    
        "MODIFY", NULL, "2024-12-08"
    )
    (
        "members",
        500, "Kainhofer, Reinhold",  // usr_id and human-readable representation
        2, "Mitglied", // mem_id from _members table and rol_name from table _roles (human-readable)
        NULL, NULL,
    
        "DELETED", NULL, NULL
    )
    

Profile field definition changes (table _user_fields)

(
    "user_fields",
    10, "BIRTHDAY",   -- usf_id and column usf_name_intern from table user_fields
    NULL, NULL,
    "usf_name", "usf_name", -- column in _user_fields table and translatable identifier

   "MODIFY", "Geburtstag", "geboren"
)

Role definitions (table _roles)

(
    "roles",
    2, "Mitglied",                             -- rol_id and (translatable / human-readable) rol_name from table _roles
    NULL, NULL,
    "rol_description", "rol_description",  -- column in _roles table and translatable identifier

    "MODIFY", "OLD DESCRIPTION", " All members" 
)

@Fasse
Copy link
Member

Fasse commented Dec 7, 2024

I don't know why you want to store these two data columns log_object_id and log_object_name. You have the record_id. Each table in your database has a unique ID. Also membership is represented by mem_id which will be stored in record_id. With that info and the log_table you can create a JOIN to the membership and read informations about role and user.

I think the database structure could be more simple. Later the UI or the email message code must be a little more complex to create that JOIN and present the necessary informations. But this should not be stored in the log-database table.

@kainhofer
Copy link
Contributor Author

When you delete a membership (as opposed to setting an end date), the corresponding record is deleted from the database, so all log entries pertaining to that membership entry use a record_id that no longer exists.
For profile fields, the record will not be deleted, so a table join is possible. The same is the case if a membership is modified / ended with an explicit end date. But when it is deleted altogether, there is no entry any more to do a join.
The only solution I see is to duplicate that information in the log table, so it is preserved even if the affected record is later on deleted from the database.

One of my main pain points currently with admidio is that when a membership is deleted, there is no trace of it any more in the database. We had the situation that a new administrative intern in our association inadvertently deleted memberships when the membership was just suspended.... Given that we are a professional association, handling accreditations that not only influence professional standing, but also are a requirement for promotion to certain positions in insurance companies, there is nothing more embarassing than us loosing information about an existing membership!

That's the reason why our secretary still needs to keep track of all changes in a separate Excel file manually, even though we switched to Admidio three years ago.

Currenly, the data handling of Admidio is not revision-safe, as things like membership changes (and deletions) are not tracked and if a membership is deleted, there is no trace of it any more.

@Fasse
Copy link
Member

Fasse commented Dec 16, 2024

Ok, you are right. Maybe the membership is a special case and need therefore an own log table. So members_log is a valid table.

What do you think about the users-profile-log. That table could have the generic approach, so that we don't have a separate log table only for three profile fields. Or the existing table users_log would get an additional field like user_table_column where the name of the changed column is stored. Then the user_table_column or the usf_id must be filled within that table. It's not the best solution but could be better handled in opposite to a separate table.

@kainhofer
Copy link
Contributor Author

Thanks for your thoughts about the log table. To be honest, the more I think about it, I'm warming up to a generic table. However, in my view, a log must store all relevant information from the time of the change directly. Links to e.g. the user or the role can help, but the user or role description / name should not be dynamic.

For memberships, I realized that we need to store all three IDs: mem_id as the object id that was changed (but which does not have its own html page, so we cannot html-link to it), and the usr_id and role ids that can/should be links to the corresponding profile and groups-roles pages.

However, I'm thinking about the implementation. The current log page uses a hardcoded SQL statement to retrieve the log. When the log table is generic and applies to all different kinds of objects, it makes sense to implement it more generic. What do you think? Would it make sens to create a new TableLog class to handle DB access to the log_changes table? That could move the saving functionality from the ChangeNotification class and the table formatting from the profile_field_history.php to the TableLog class. Advantage would be that formatting of logs for additional objects (e.g. when changes to the role or user/profile field definitions are also stored in the log table) could all be done in one place.

The formatting of the object will heavily depend on the DB table of the original record, e.g. for a change to a record in %PREFIX%_users, we want to add a link to the user's profile page, while a change to a membership will have both a link to the user's profile, as well as the group/role affected. And a change to a profile field definition will need a link to the field. Also, a user's change log page should list profile field changes as well as membership changes. The meaning of the object ID in these cases is very different, so each line needs its own formatter (depending on the table).

On the other hand, if I understand the architecture right, the TableAccess class is mainly tailored to individual records and not so much to long lists of DB entries. Would it still make sense?

@kainhofer
Copy link
Contributor Author

Or the existing table users_log would get an additional field like user_table_column where the name of the changed column is stored.

My current idea is to have a generic (string) column for the field. If table=="user_data", then it should contain the custom field ID, while for table=="users" it will contain the name of the column (e.g. "usr_login_name"). The meaning would then depend on the table name.
Since we cannot use DB constraints and foreign keys with a generic table (even if we just join the _users and _user_data logs to one logging table), at least we are not technically forced to have a unique, clear meaning of the column.

@Fasse
Copy link
Member

Fasse commented Dec 29, 2024

Sorry for my late answer. I thought it was your turn to response. But as I look today it was me to write.

I'm with you. Membership is a special case and could be a special log table. For the others it could be that generic table and as you described it could link to a special column field e.g. "usr_login_name" or to a generic ID field e.g. the ID of user fields.

The presentation may be not so dynamic as the table itself. There it could be that we have to switch the different cases for the links depending on the logged table. But I thinks these are only a few places in the UI where we have that different cases but it will be a huge advantage to implement the basic log UI only once because a lot of the UI could be used in all cases.

Maybe it would make sense to implement these feature only for version 5.0 because I think this will be a huge step and some work. Maybe the membership-log could be implemented also in 4.3 because you have done this already.

@kainhofer
Copy link
Contributor Author

I've started implementing a logging mechanism that is located deep inside the TableAccess class (inside the ::save method), so that each and every database record change can be logged to a generic changelog table. Advantage is that other objects like user field definitions, preferences, events etc. are automatically logged, too.

The hard part now is to figure out which changes should not be logged, and for the others, which fields should be logged how so that a generic changelog view is able to properly display and even link to the correct records.

Logging of changes to _users and _user_data is fully implemented and polished, all other tables are implemented, but need some more polish. Also, I have not yet implemented any setting to prevent logging for certain tables (by default, I exclude 'log_changes', 'id', 'auto_login', 'sessions', 'components', 'registrations', but that is not yet configurable).

Once I have finished a first version of the changelog view, I'll submit a pull request for first feedback. That's not meant to be merged, but rather to get your ideas about certain choices I made.

@kainhofer
Copy link
Contributor Author

Example of current database entries for the logging table, including changes to user field definitions, announcements, categories, user profile data and profile fields:
image
I manually deleted a few spurious/irrelevant log entries, which should not be created in the first place... That's one part of the aforementioned polishing.
The proper logging of the members table still needs some work, but users and user_data is more or less working (i.e. current state).

@Fasse
Copy link
Member

Fasse commented Jan 1, 2025

That sounds like a good implementation.

Are you implementing this in the 4.3 branch or in the master? Since you mentioned the TableAccess you are using the branch because the class is renamed to Entity in the master. Are there any reasons why you are using the branch because of the huge changes to the codebase there will be much work to implement this again in the master.

@Fasse Fasse removed this from the v4.3.13 milestone Jan 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants