-
Notifications
You must be signed in to change notification settings - Fork 36
Expand file tree
/
Copy pathneo4j_import.py
More file actions
1960 lines (1670 loc) · 89.1 KB
/
neo4j_import.py
File metadata and controls
1960 lines (1670 loc) · 89.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
"""
Neo4j Import Script - Import Welcome Season data into Neo4j graph database via GraphQL.
This script uses the GraphQL endpoint to execute Cypher queries through the executeCypher mutation.
Neo4j handles embeddings and vector search automatically.
This script creates nodes with properties and relationships from the Excel data.
Node Types:
- POC: Points of Contact records
- WarRoom: War Room teams
- ApplicationContact: Product/Engineering contacts
- StaticKnowledge: FAQ-style knowledge
- Person: Individual contacts extracted from data
- Area: Teams/Areas/Departments
- Sheet: Source Excel sheets
Relationships:
- (:POC)-[:HAS_CONTACT]->(:Person)
- (:POC)-[:BELONGS_TO]->(:Area)
- (:POC)-[:MANAGED_BY]->(:Person)
- (:WarRoom)-[:HAS_VP]->(:Person)
- (:WarRoom)-[:HAS_PRIMARY_LEAD]->(:Person)
- (:WarRoom)-[:HAS_SECONDARY_LEAD]->(:Person)
- (:ApplicationContact)-[:HAS_PRODUCT_LEAD]->(:Person)
- (:ApplicationContact)-[:HAS_ENGINEERING_LEAD]->(:Person)
- (:Document)-[:FROM_SHEET]->(:Sheet)
- (:Person)-[:WORKS_IN]->(:Area)
Usage:
python neo4j_import.py --verify
python neo4j_import.py --uri https://your-graphql-endpoint/graphql/ --user admin --password secret
"""
import pandas as pd
from pathlib import Path
from typing import Dict, Any, List, Set, Tuple
import re
import requests
import json
class Neo4jImporter:
"""Import Welcome Season data into Neo4j with relationships."""
# Role to label mapping for Person nodes
ROLE_LABELS = {
'VP': 'VP',
'Primary Lead': 'Lead',
'Secondary Lead': 'Lead',
'POC Manager': 'Manager',
'WS Point of Contact': 'POC',
'Engineering/Product Leader': 'Leader',
'Product Lead': 'ProductLead',
'Product Contact': 'ProductContact',
'Engineering Lead': 'EngineeringLead',
'Engineering Contact': 'EngineeringContact',
}
# Role to expertise category mapping
ROLE_EXPERTISE = {
'VP': 'Leadership',
'Primary Lead': 'Leadership',
'Secondary Lead': 'Leadership',
'POC Manager': 'Management',
'WS Point of Contact': 'Operations',
'Engineering/Product Leader': 'Engineering',
'Product Lead': 'Product',
'Product Contact': 'Product',
'Engineering Lead': 'Engineering',
'Engineering Contact': 'Engineering',
}
def __init__(self, uri: str = "http://localhost:7474/graphql", user: str = "admin", password: str = "password"):
"""Initialize GraphQL connection with basic authentication."""
self.base_url = uri
self.auth = (user, password)
base_dir = Path(__file__).parent
# Multiple Excel files
self.excel_file = base_dir / "WS2026 POCs - War Rooms.xlsx"
self.coverage_schedule_file = base_dir / "CMKDigital-Emerson_WS2026 Coverage Schedule.xlsx"
self.support_roster_file = base_dir / "2026_WS_Support_Roster.xlsx"
# Track created entities for relationship building
self.persons: Dict[str, str] = {} # name -> node_id
self.person_roles: Dict[str, Set[str]] = {} # name -> set of roles
self.areas: Dict[str, str] = {} # area_name -> node_id
self.schedules: Dict[str, str] = {} # schedule_id -> node_id
self.support_shifts: Dict[str, str] = {} # shift_id -> node_id
def close(self):
"""Close the connection (no-op for HTTP)."""
pass
def _execute_query(self, query: str, **kwargs) -> List[Dict[str, Any]]:
"""Execute a Cypher query via GraphQL API."""
parameters = kwargs if kwargs else {}
headers = {
"Content-Type": "application/json"
}
# Wrap Cypher query in GraphQL executeCypher mutation
graphql_query = """
mutation ExecuteCypher($statement: String!, $parameters: JSON) {
executeCypher(input: {statement: $statement, parameters: $parameters}) {
columns
rows
rowCount
executionTimeMs
}
}
"""
response = requests.post(
self.base_url,
headers=headers,
auth=self.auth,
json={
"query": graphql_query,
"variables": {
"statement": query,
"parameters": parameters
}
}
)
# 200 and 202 are both success responses
if response.status_code in (200, 202):
data = response.json()
# Check for GraphQL errors first
if "errors" in data:
error_msg = json.dumps(data['errors'], indent=2)
raise Exception(f"GraphQL errors: {error_msg}")
# Extract results from GraphQL response
if "data" in data and "executeCypher" in data["data"]:
result = data["data"]["executeCypher"]
# Convert GraphQL result format to Neo4j-like format
rows = result.get("rows", [])
columns = result.get("columns", [])
# Unwrap values from {"value": ...} format
unwrapped_rows = []
for row in rows:
unwrapped_row = []
for cell in row:
if isinstance(cell, dict) and "value" in cell:
unwrapped_row.append(cell["value"])
else:
unwrapped_row.append(cell)
unwrapped_rows.append(unwrapped_row)
# Format as [{row: [values], meta: []}] for compatibility
return [{"row": row, "meta": []} for row in unwrapped_rows]
else:
# No data returned - might be a CREATE/DELETE that succeeded
return []
else:
error_text = response.text[:500] if response.text else "No error message"
raise Exception(f"GraphQL query failed: {response.status_code} - {error_text}")
def clear_database(self):
"""Clear all nodes and relationships."""
try:
self._execute_query("MATCH (n) DETACH DELETE n")
print("🗑️ Database cleared")
# Verify the clear worked
result = self._execute_query("MATCH (n) RETURN count(n) as total")
if result and result[0].get('row'):
count = result[0]['row'][0]
print(f" Verified: {count} nodes remaining")
except Exception as e:
print(f"⚠️ Error clearing database: {e}")
raise
def create_constraints_and_indexes(self):
"""Create constraints and indexes for better performance."""
constraints = [
"CREATE CONSTRAINT IF NOT EXISTS FOR (p:POC) REQUIRE p.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (w:WarRoom) REQUIRE w.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (a:ApplicationContact) REQUIRE a.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (d:Document) REQUIRE d.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (p:Person) REQUIRE p.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (a:Area) REQUIRE a.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (s:Sheet) REQUIRE s.name IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (c:CoverageSchedule) REQUIRE c.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (s:SupportSchedule) REQUIRE s.id IS UNIQUE",
"CREATE CONSTRAINT IF NOT EXISTS FOR (s:SupportRoster) REQUIRE s.id IS UNIQUE",
]
for constraint in constraints:
try:
self._execute_query(constraint)
except Exception as e:
if "already exists" not in str(e).lower():
print(f"⚠️ Constraint warning: {e}")
# Create indexes for common search fields
indexes = [
"CREATE INDEX IF NOT EXISTS FOR (p:Person) ON (p.name)",
"CREATE INDEX IF NOT EXISTS FOR (a:Area) ON (a.name)",
"CREATE INDEX IF NOT EXISTS FOR (d:Document) ON (d.content)",
]
for index in indexes:
try:
self._execute_query(index)
except Exception:
pass
print("✅ Constraints and indexes created")
def create_sheet_nodes(self):
"""Create nodes for source sheets from all Excel files."""
sheets = [
# From WS2026 POCs - War Rooms.xlsx
{'name': 'WS2026 POCs', 'description': 'Welcome Season 2026 Points of Contact', 'source_file': 'WS2026 POCs - War Rooms.xlsx'},
{'name': 'War Rooms', 'description': 'War Room teams and schedules', 'source_file': 'WS2026 POCs - War Rooms.xlsx'},
{'name': 'PS Digital Product-Eng', 'description': 'Product and Engineering contacts', 'source_file': 'WS2026 POCs - War Rooms.xlsx'},
{'name': 'CSX Resolution Center Delegates', 'description': 'CSX Resolution Center business area delegates', 'source_file': 'WS2026 POCs - War Rooms.xlsx'},
# From CMKDigital-Emerson_WS2026 Coverage Schedule.xlsx
{'name': 'Coverage Schedule', 'description': 'Welcome Season 2026 coverage schedules and speak times', 'source_file': 'CMKDigital-Emerson_WS2026 Coverage Schedule.xlsx'},
# From 2026_WS_Support_Roster.xlsx
{'name': 'Support Roster', 'description': 'Welcome Season 2026 support roster with availability', 'source_file': '2026_WS_Support_Roster.xlsx'},
# Static knowledge
{'name': 'Static Knowledge', 'description': 'FAQ and general information', 'source_file': 'System'},
]
for sheet in sheets:
self._execute_query("""
MERGE (s:Sheet {name: $name})
SET s.description = $description,
s.source_file = $source_file
""", name=sheet['name'], description=sheet['description'], source_file=sheet.get('source_file', 'Unknown'))
print(f" ✅ Created {len(sheets)} Sheet nodes")
def _normalize_name(self, name: str) -> str:
"""Normalize a person's name for consistent matching."""
if not name:
return ""
# Remove phone numbers first
name = re.sub(r'\d{3}[-.\s]?\d{3}[-.\s]?\d{4}', '', name)
# Extract name before parentheses (person name is usually before the description)
# e.g., "Diana Newell (RxClaim Adj, CCA/COS, ...)" -> "Diana Newell"
paren_match = re.match(r'^([^(]+)\(', name)
if paren_match:
name = paren_match.group(1).strip()
else:
# If no parentheses, take first part before comma (if comma-separated list)
name = re.sub(r'[,;].*$', '', name)
name = name.strip().strip(',').strip()
return name
def _is_valid_person_name(self, name: str) -> bool:
"""Validate that a string looks like a person's name."""
if not name or len(name) < 2:
return False
name_lower = name.lower().strip()
# Reject common non-name patterns
invalid_patterns = [
r'^layer\)?$', # "Layer" or "Layer)"
r'^mypbm\)?$', # "myPBM" or "myPBM)"
r'^comm\s+layer\)?$', # "Comm Layer" or "Comm Layer)"
r'^[^a-zA-Z]*$', # No letters at all
r'^\d+$', # Only numbers
r'^[A-Z]{2,}$', # Only uppercase acronyms (like "CDH", "COS")
r'^[a-z]+$', # Only lowercase (likely not a name)
r'\)$', # Ends with just closing paren (incomplete extraction)
r'^[^a-zA-Z]+$', # No letters
]
for pattern in invalid_patterns:
if re.match(pattern, name_lower):
return False
# Must contain at least one letter and look like a name
# Names typically have: first letter uppercase, rest mixed case, or multiple words
if not re.search(r'[a-zA-Z]', name):
return False
# Check if it looks like a name (has spaces or is a reasonable single name)
# Single names should be at least 3 chars and have mixed case or be a known pattern
words = name.split()
if len(words) == 1:
# Single word - should be at least 3 chars and not all caps (unless it's a short name)
if len(name) < 3:
return False
# Reject if it's all caps and longer than 4 chars (likely acronym)
if name.isupper() and len(name) > 4:
return False
# Reject if it's clearly a system/application name
system_keywords = ['layer', 'mypbm', 'comm', 'outputs', 'adj', 'pde', 'dur',
'eligibility', 'vendor', 'transition', 'network', 'prescriber',
'claim', 'rxclaim', 'cdh', 'cos', 'cca', 'sbo', 'r&r', 'eztest',
'colas', 'mft', 'b2b', 'myclaims']
if any(keyword in name_lower for keyword in system_keywords):
return False
return True
def _extract_phone(self, text: str) -> str:
"""Extract phone number from text."""
if not text:
return None
match = re.search(r'(\d{3}[-.\s]?\d{3}[-.\s]?\d{4})', text)
return match.group(1) if match else None
def get_or_create_person(self, name: str, phone: str = None, role: str = None) -> str:
"""Get or create a Person node, return the node ID."""
if not name or name.strip() == '' or name.lower() in ['nan', 'none', 'n/a']:
return None
normalized = self._normalize_name(name)
if not normalized:
return None
# Validate that this looks like a person name
if not self._is_valid_person_name(normalized):
return None
# Check cache
cache_key = normalized.lower()
# Track roles for this person
if cache_key not in self.person_roles:
self.person_roles[cache_key] = set()
if role:
self.person_roles[cache_key].add(role)
# Get all roles and derive labels/expertise
roles = list(self.person_roles[cache_key])
labels = list(set(self.ROLE_LABELS.get(r, '') for r in roles if self.ROLE_LABELS.get(r)))
expertise = list(set(self.ROLE_EXPERTISE.get(r, '') for r in roles if self.ROLE_EXPERTISE.get(r)))
if cache_key in self.persons:
# Update existing person with new role info
if role:
self._execute_query("""
MATCH (p:Person {name: $name})
SET p.roles = $roles,
p.labels = $labels,
p.expertise = $expertise
""", name=normalized, roles=roles, labels=labels, expertise=expertise)
# Add dynamic label based on role
label = self.ROLE_LABELS.get(role)
if label:
self._execute_query(f"""
MATCH (p:Person {{name: $name}})
SET p:{label}
""", name=normalized)
return self.persons[cache_key]
# Create unique ID
person_id = f"person_{len(self.persons)}"
# Create node with roles array and expertise
result = self._execute_query("""
MERGE (p:Person {name: $name})
ON CREATE SET p.id = $id, p.phone = $phone, p.roles = $roles, p.labels = $labels, p.expertise = $expertise
ON MATCH SET p.phone = COALESCE(p.phone, $phone),
p.roles = $roles,
p.labels = $labels,
p.expertise = $expertise
RETURN p.id as id
""", name=normalized, id=person_id, phone=phone, roles=roles, labels=labels, expertise=expertise)
# Add dynamic label based on role
if role:
label = self.ROLE_LABELS.get(role)
if label:
self._execute_query(f"""
MATCH (p:Person {{name: $name}})
SET p:{label}
""", name=normalized)
actual_id = result[0]['row'][0] if result and result[0].get('row') else person_id
self.persons[cache_key] = actual_id
return actual_id
def get_or_create_area(self, area_name: str) -> str:
"""Get or create an Area node, return the node ID."""
if not area_name or area_name.strip() == '' or area_name.lower() in ['nan', 'none', 'n/a', 'area/team']:
return None
area_name = area_name.strip()
cache_key = area_name.lower()
if cache_key in self.areas:
return self.areas[cache_key]
area_id = f"area_{len(self.areas)}"
self._execute_query("""
MERGE (a:Area {name: $name})
ON CREATE SET a.id = $id
RETURN a.id
""", name=area_name, id=area_id)
self.areas[cache_key] = area_id
return area_id
def import_pocs(self, df: pd.DataFrame) -> int:
"""Import POCs with relationships."""
count = 0
for idx, row in df.iterrows():
# Use named columns from spreadsheet
area_team = row.get('Area/Team')
if pd.isna(area_team) or str(area_team).strip() == '':
# Try to get data from other columns even if area is empty
has_data = any(pd.notna(row.get(col)) for col in ['Engineering and Product Leaders', 'WS Point of Contact', "POC's Manager(s)"])
if not has_data:
continue
area_team = None
else:
area_team = str(area_team).strip()
props = {
'id': f"poc_{idx}",
'area_team': area_team,
'engineering_product_leaders': str(row.get('Engineering and Product Leaders')).strip() if pd.notna(row.get('Engineering and Product Leaders')) else None,
'ws_point_of_contact': str(row.get('WS Point of Contact')).strip() if pd.notna(row.get('WS Point of Contact')) else None,
'poc_managers': str(row.get("POC's Manager(s)")).strip() if pd.notna(row.get("POC's Manager(s)")) else None,
'product_application_service': str(row.get('Product, Application or Service')).strip() if pd.notna(row.get('Product, Application or Service')) else None,
'source_sheet': 'WS2026 POCs',
'row_number': int(idx),
}
props = {k: v for k, v in props.items() if v is not None}
# Build content for search
text_parts = [f"{k}: {v}" for k, v in props.items() if k not in ['id', 'source_sheet', 'row_number']]
props['content'] = '\n'.join(text_parts)
# Create POC node
try:
self._execute_query("""
CREATE (p:POC:Document $props)
RETURN p.id as id
""", props=props)
except Exception as e:
print(f"⚠️ Error creating POC node {props.get('id')}: {e}")
continue
# Create relationships
poc_id = props['id']
# Link to Sheet
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (s:Sheet {name: 'WS2026 POCs'})
MERGE (p)-[:FROM_SHEET]->(s)
""", poc_id=poc_id)
# Link to Area
if area_team:
area_id = self.get_or_create_area(area_team)
if area_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (a:Area {name: $area_name})
MERGE (p)-[:BELONGS_TO]->(a)
""", poc_id=poc_id, area_name=area_team.strip())
# Link to WS Point of Contact (Person)
ws_contact = props.get('ws_point_of_contact')
if ws_contact:
# For contacts with descriptions in parentheses, extract the name part only
# e.g., "Diana Newell (RxClaim Adj, ...)" -> just process "Diana Newell"
# Don't split by comma if there are parentheses (the comma is part of the description)
if '(' in ws_contact and ')' in ws_contact:
# Extract name before parentheses
normalized_contact = self._normalize_name(ws_contact)
if normalized_contact and self._is_valid_person_name(normalized_contact):
phone = self._extract_phone(ws_contact)
person_id = self.get_or_create_person(normalized_contact, phone, 'WS Point of Contact')
if person_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (person:Person {name: $person_name})
MERGE (p)-[:HAS_CONTACT {role: 'WS Point of Contact'}]->(person)
""", poc_id=poc_id, person_name=normalized_contact)
else:
# Handle multiple contacts separated by comma or semicolon (no parentheses)
contacts = re.split(r'[,;]', ws_contact)
for contact in contacts:
contact = contact.strip()
if contact:
normalized_contact = self._normalize_name(contact)
if normalized_contact and self._is_valid_person_name(normalized_contact):
phone = self._extract_phone(contact)
person_id = self.get_or_create_person(normalized_contact, phone, 'WS Point of Contact')
if person_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (person:Person {name: $person_name})
MERGE (p)-[:HAS_CONTACT {role: 'WS Point of Contact'}]->(person)
""", poc_id=poc_id, person_name=normalized_contact)
# Link to POC Managers
managers = props.get('poc_managers')
if managers:
# Handle names with descriptions in parentheses
if '(' in managers and ')' in managers:
normalized_manager = self._normalize_name(managers)
if normalized_manager and self._is_valid_person_name(normalized_manager):
person_id = self.get_or_create_person(normalized_manager, None, 'POC Manager')
if person_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (person:Person {name: $person_name})
MERGE (p)-[:MANAGED_BY]->(person)
""", poc_id=poc_id, person_name=normalized_manager)
else:
manager_list = re.split(r'[,;]', managers)
for manager in manager_list:
manager = manager.strip()
if manager:
normalized_manager = self._normalize_name(manager)
if normalized_manager and self._is_valid_person_name(normalized_manager):
person_id = self.get_or_create_person(normalized_manager, None, 'POC Manager')
if person_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (person:Person {name: $person_name})
MERGE (p)-[:MANAGED_BY]->(person)
""", poc_id=poc_id, person_name=normalized_manager)
# Link engineering/product leaders
leaders = props.get('engineering_product_leaders')
if leaders:
# Handle names with descriptions in parentheses
if '(' in leaders and ')' in leaders:
normalized_leader = self._normalize_name(leaders)
if normalized_leader and self._is_valid_person_name(normalized_leader):
person_id = self.get_or_create_person(normalized_leader, None, 'Engineering/Product Leader')
if person_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (person:Person {name: $person_name})
MERGE (p)-[:HAS_LEADER]->(person)
""", poc_id=poc_id, person_name=normalized_leader)
else:
leader_list = re.split(r'[,;]', leaders)
for leader in leader_list:
leader = leader.strip()
if leader:
normalized_leader = self._normalize_name(leader)
if normalized_leader and self._is_valid_person_name(normalized_leader):
person_id = self.get_or_create_person(normalized_leader, None, 'Engineering/Product Leader')
if person_id:
self._execute_query("""
MATCH (p:POC {id: $poc_id}), (person:Person {name: $person_name})
MERGE (p)-[:HAS_LEADER]->(person)
""", poc_id=poc_id, person_name=normalized_leader)
count += 1
return count
def import_war_rooms(self, df: pd.DataFrame) -> int:
"""Import War Rooms with relationships."""
count = 0
for idx, row in df.iterrows():
team_name = row.get('War Room Team')
if pd.isna(team_name) or str(team_name).strip() == '':
continue
# Schedule date columns
schedule_dates = [
'Thur. 1/1', 'Fri. 1/2', 'Sat. J1/3', 'Sun. 1/4', 'Mon. 1/5',
'Tues. 1/6', 'Wed. 1/7', 'Thurs. 1/8', 'Fri. 1/9', 'Sat. 1/10',
'Sun. 1/11', 'Mon. 1/12', 'Tues. 1/13', 'Wed. 1/14', 'Thurs. 1/15',
'Fri. 1/16', 'Sat. 1/17', 'Sun. 1/18', 'Mon. 1/19', 'Tues. 1/20'
]
props = {
'id': f"warroom_{idx}",
'team_name': str(team_name).strip(),
'meeting_link': str(row.get('Meeting Link', '')) if pd.notna(row.get('Meeting Link')) else None,
'vp': str(row.get('VP', '')) if pd.notna(row.get('VP')) else None,
'primary_lead': str(row.get('Primary Lead & Phone #`', '')) if pd.notna(row.get('Primary Lead & Phone #`')) else None,
'secondary_lead': str(row.get('Secondary Lead & Phone #', '')) if pd.notna(row.get('Secondary Lead & Phone #')) else None,
'location_type': str(row.get('Virtual, Onsite (physical location), Combined (physical location)', '')) if pd.notna(row.get('Virtual, Onsite (physical location), Combined (physical location)')) else None,
'onsite_vendors': str(row.get('Onsite Vendors', '')) if pd.notna(row.get('Onsite Vendors')) else None,
'source_sheet': 'War Rooms',
'row_number': int(idx),
}
# Add all schedule dates
schedule_info = {}
for date_col in schedule_dates:
if date_col in row.index and pd.notna(row.get(date_col)):
# Normalize column name to valid property key
key = date_col.lower().replace('.', '').replace(' ', '_').replace('/', '_')
schedule_info[key] = str(row.get(date_col))
props['schedule'] = schedule_info if schedule_info else None
props = {k: v for k, v in props.items() if v is not None}
# Build content including schedule for search
text_parts = [f"{k}: {v}" for k, v in props.items() if k not in ['id', 'source_sheet', 'row_number', 'schedule']]
if schedule_info:
text_parts.append(f"Schedule: {schedule_info}")
props['content'] = '\n'.join(text_parts)
# Create WarRoom node
self._execute_query("CREATE (w:WarRoom:Document $props)", props=props)
warroom_id = props['id']
# Link to Sheet
self._execute_query("""
MATCH (w:WarRoom {id: $warroom_id}), (s:Sheet {name: 'War Rooms'})
MERGE (w)-[:FROM_SHEET]->(s)
""", warroom_id=warroom_id)
# Link to Area (team name as area)
area_id = self.get_or_create_area(props['team_name'])
if area_id:
self._execute_query("""
MATCH (w:WarRoom {id: $warroom_id}), (a:Area {name: $area_name})
MERGE (w)-[:BELONGS_TO]->(a)
""", warroom_id=warroom_id, area_name=props['team_name'])
# Link to VP
vp = props.get('vp')
if vp:
person_id = self.get_or_create_person(vp, None, 'VP')
if person_id:
self._execute_query("""
MATCH (w:WarRoom {id: $warroom_id}), (p:Person {name: $person_name})
MERGE (w)-[:HAS_VP]->(p)
""", warroom_id=warroom_id, person_name=self._normalize_name(vp))
# Link to Primary Lead
primary = props.get('primary_lead')
if primary:
phone = self._extract_phone(primary)
person_id = self.get_or_create_person(primary, phone, 'Primary Lead')
if person_id:
self._execute_query("""
MATCH (w:WarRoom {id: $warroom_id}), (p:Person {name: $person_name})
MERGE (w)-[:HAS_PRIMARY_LEAD]->(p)
""", warroom_id=warroom_id, person_name=self._normalize_name(primary))
# Link to Secondary Lead
secondary = props.get('secondary_lead')
if secondary:
phone = self._extract_phone(secondary)
person_id = self.get_or_create_person(secondary, phone, 'Secondary Lead')
if person_id:
self._execute_query("""
MATCH (w:WarRoom {id: $warroom_id}), (p:Person {name: $person_name})
MERGE (w)-[:HAS_SECONDARY_LEAD]->(p)
""", warroom_id=warroom_id, person_name=self._normalize_name(secondary))
count += 1
return count
def import_application_contacts(self, df: pd.DataFrame) -> int:
"""Import Application Contacts with relationships."""
count = 0
for idx, row in df.iterrows():
structure = row.get('nornic Ops Structure')
if pd.isna(structure) or str(structure).strip() == '':
continue
props = {
'id': f"appcontact_{idx}",
'nornic_ops_structure': str(structure).strip(),
'cmk_com': str(row.get('CMK.com', '')) if pd.notna(row.get('CMK.com')) else None,
'cvsh_app': str(row.get('CVSH App', '')) if pd.notna(row.get('CVSH App')) else None,
'cvsh_app_mapping': str(row.get('CVSH App Mapping', '')) if pd.notna(row.get('CVSH App Mapping')) else None,
'product_lead': str(row.get('Product Lead', '')) if pd.notna(row.get('Product Lead')) else None,
'product_contact': str(row.get('Product Contact', '')) if pd.notna(row.get('Product Contact')) else None,
'engineering_lead': str(row.get('Engineering Lead', '')) if pd.notna(row.get('Engineering Lead')) else None,
'engineering_contact': str(row.get('Engineering Contact', '')) if pd.notna(row.get('Engineering Contact')) else None,
'super_app_product_lead': str(row.get('Super App Product Lead', '')) if pd.notna(row.get('Super App Product Lead')) else None,
'super_app_product_contact': str(row.get('Super App Product Contact', '')) if pd.notna(row.get('Super App Product Contact')) else None,
'super_app_engineering_lead': str(row.get('Super App Engineering Lead', '')) if pd.notna(row.get('Super App Engineering Lead')) else None,
'super_app_engineering_contact': str(row.get('Super App Engineering Contact', '')) if pd.notna(row.get('Super App Engineering Contact')) else None,
'source_sheet': 'PS Digital Product-Eng',
'row_number': int(idx),
}
props = {k: v for k, v in props.items() if v is not None}
text_parts = [f"{k}: {v}" for k, v in props.items() if k not in ['id', 'source_sheet', 'row_number']]
props['content'] = '\n'.join(text_parts)
self._execute_query("CREATE (a:ApplicationContact:Document $props)", props=props)
app_id = props['id']
# Link to Sheet
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (s:Sheet {name: 'PS Digital Product-Eng'})
MERGE (a)-[:FROM_SHEET]->(s)
""", app_id=app_id)
# Link to Area (ops structure as area)
area_id = self.get_or_create_area(props['nornic_ops_structure'])
if area_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (area:Area {name: $area_name})
MERGE (a)-[:BELONGS_TO]->(area)
""", app_id=app_id, area_name=props['nornic_ops_structure'])
# Link Product Lead
product_lead = props.get('product_lead')
if product_lead:
person_id = self.get_or_create_person(product_lead, None, 'Product Lead')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_PRODUCT_LEAD]->(p)
""", app_id=app_id, person_name=self._normalize_name(product_lead))
# Link Product Contact
product_contact = props.get('product_contact')
if product_contact:
person_id = self.get_or_create_person(product_contact, None, 'Product Contact')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_PRODUCT_CONTACT]->(p)
""", app_id=app_id, person_name=self._normalize_name(product_contact))
# Link Engineering Lead
eng_lead = props.get('engineering_lead')
if eng_lead:
person_id = self.get_or_create_person(eng_lead, None, 'Engineering Lead')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_ENGINEERING_LEAD]->(p)
""", app_id=app_id, person_name=self._normalize_name(eng_lead))
# Link Engineering Contact
eng_contact = props.get('engineering_contact')
if eng_contact:
person_id = self.get_or_create_person(eng_contact, None, 'Engineering Contact')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_ENGINEERING_CONTACT]->(p)
""", app_id=app_id, person_name=self._normalize_name(eng_contact))
# Link Super App Product Lead
super_product_lead = props.get('super_app_product_lead')
if super_product_lead:
person_id = self.get_or_create_person(super_product_lead, None, 'Product Lead')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_SUPER_APP_PRODUCT_LEAD]->(p)
""", app_id=app_id, person_name=self._normalize_name(super_product_lead))
# Link Super App Product Contact
super_product_contact = props.get('super_app_product_contact')
if super_product_contact:
person_id = self.get_or_create_person(super_product_contact, None, 'Product Contact')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_SUPER_APP_PRODUCT_CONTACT]->(p)
""", app_id=app_id, person_name=self._normalize_name(super_product_contact))
# Link Super App Engineering Lead
super_eng_lead = props.get('super_app_engineering_lead')
if super_eng_lead:
person_id = self.get_or_create_person(super_eng_lead, None, 'Engineering Lead')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_SUPER_APP_ENGINEERING_LEAD]->(p)
""", app_id=app_id, person_name=self._normalize_name(super_eng_lead))
# Link Super App Engineering Contact
super_eng_contact = props.get('super_app_engineering_contact')
if super_eng_contact:
person_id = self.get_or_create_person(super_eng_contact, None, 'Engineering Contact')
if person_id:
self._execute_query("""
MATCH (a:ApplicationContact {id: $app_id}), (p:Person {name: $person_name})
MERGE (a)-[:HAS_SUPER_APP_ENGINEERING_CONTACT]->(p)
""", app_id=app_id, person_name=self._normalize_name(super_eng_contact))
count += 1
return count
def import_csx_delegates(self, df: pd.DataFrame) -> int:
"""Import CSX Resolution Center Delegates with relationships."""
count = 0
for idx, row in df.iterrows():
functional_area = row.get('PBM and Corporate Functional Area')
if pd.isna(functional_area) or str(functional_area).strip() == '':
continue
props = {
'id': f"csx_delegate_{idx}",
'functional_area': str(functional_area).strip(),
'business_group': str(row.get('Business Group', '')) if pd.notna(row.get('Business Group')) else None,
'business_group_owner': str(row.get('Business Group Owner', '')) if pd.notna(row.get('Business Group Owner')) else None,
'business_area': str(row.get('Business Area', '')) if pd.notna(row.get('Business Area')) else None,
'business_area_owner': str(row.get('Business Area Owner', '')) if pd.notna(row.get('Business Area Owner')) else None,
'delegates': str(row.get('Delegates', '')) if pd.notna(row.get('Delegates')) else None,
'source_sheet': 'CSX Resolution Center Delegates',
'row_number': int(idx),
}
props = {k: v for k, v in props.items() if v is not None}
text_parts = [f"{k}: {v}" for k, v in props.items() if k not in ['id', 'source_sheet', 'row_number']]
props['content'] = '\n'.join(text_parts)
self._execute_query("CREATE (c:CSXDelegate:Document $props)", props=props)
delegate_id = props['id']
# Link to Sheet
self._execute_query("""
MATCH (c:CSXDelegate {id: $delegate_id}), (s:Sheet {name: 'CSX Resolution Center Delegates'})
MERGE (c)-[:FROM_SHEET]->(s)
""", delegate_id=delegate_id)
# Link to Area (functional area)
area_id = self.get_or_create_area(props['functional_area'])
if area_id:
self._execute_query("""
MATCH (c:CSXDelegate {id: $delegate_id}), (a:Area {name: $area_name})
MERGE (c)-[:BELONGS_TO]->(a)
""", delegate_id=delegate_id, area_name=props['functional_area'])
# Link Business Group Owner
bg_owner = props.get('business_group_owner')
if bg_owner:
person_id = self.get_or_create_person(bg_owner, None, 'Business Group Owner')
if person_id:
self._execute_query("""
MATCH (c:CSXDelegate {id: $delegate_id}), (p:Person {name: $person_name})
MERGE (c)-[:HAS_BUSINESS_GROUP_OWNER]->(p)
""", delegate_id=delegate_id, person_name=self._normalize_name(bg_owner))
# Link Business Area Owner
ba_owner = props.get('business_area_owner')
if ba_owner:
person_id = self.get_or_create_person(ba_owner, None, 'Business Area Owner')
if person_id:
self._execute_query("""
MATCH (c:CSXDelegate {id: $delegate_id}), (p:Person {name: $person_name})
MERGE (c)-[:HAS_BUSINESS_AREA_OWNER]->(p)
""", delegate_id=delegate_id, person_name=self._normalize_name(ba_owner))
# Link Delegates (may be multiple, separated by commas)
delegates = props.get('delegates')
if delegates:
# Handle names with descriptions in parentheses
if '(' in delegates and ')' in delegates:
# If it's a single entry with parentheses, extract the name
normalized_delegate = self._normalize_name(delegates)
if normalized_delegate and self._is_valid_person_name(normalized_delegate):
person_id = self.get_or_create_person(normalized_delegate, None, 'Delegate')
if person_id:
self._execute_query("""
MATCH (c:CSXDelegate {id: $delegate_id}), (p:Person {name: $person_name})
MERGE (c)-[:HAS_DELEGATE]->(p)
""", delegate_id=delegate_id, person_name=normalized_delegate)
else:
# Multiple delegates separated by commas
delegate_list = re.split(r'[,;]', delegates)
for delegate in delegate_list:
delegate = delegate.strip()
if delegate:
normalized_delegate = self._normalize_name(delegate)
if normalized_delegate and self._is_valid_person_name(normalized_delegate):
person_id = self.get_or_create_person(normalized_delegate, None, 'Delegate')
if person_id:
self._execute_query("""
MATCH (c:CSXDelegate {id: $delegate_id}), (p:Person {name: $person_name})
MERGE (c)-[:HAS_DELEGATE]->(p)
""", delegate_id=delegate_id, person_name=normalized_delegate)
count += 1
return count
def import_static_knowledge(self) -> int:
"""Import static knowledge documents."""
static_docs = [
{
'id': 'static_schedule',
'title': 'Welcome Season 2026 Schedule',
'content': '''Welcome Season 2026 Schedule and Timeline
Welcome Season 2026 runs from January 1, 2026 through January 20, 2026.
Peak days are during the first three weeks of January.
Key dates:
- January 1, 2026: Welcome Season starts (New Year's Day)
- January 1-20, 2026: Peak period for War Rooms
- War Rooms operate daily during peak period
This is the annual period when new healthcare benefits take effect and member activity is highest.
Welcome Season is also known as WS2026 or Welcome Season '26.''',
'type': 'schedule',
},
{
'id': 'static_what_is_ws',
'title': 'What is Welcome Season',
'content': '''What is Welcome Season?
Welcome Season is the annual period at the beginning of the calendar year when:
- New health insurance benefits take effect for members
- Members call to understand their new coverage
- Prescription transfers and new enrollments are processed
- Call volumes are significantly higher than normal
- War Rooms are activated to handle increased demand
Welcome Season 2026 (WS2026) runs from January 1 through January 20, 2026.
This is a critical operational period for healthcare and pharmacy services.''',
'type': 'general_info',
},
{
'id': 'static_war_rooms',
'title': 'What are War Rooms',
'content': '''What are War Rooms?
War Rooms are dedicated operational command centers activated during Welcome Season.
Purpose:
- Coordinate response to high-volume periods
- Monitor system performance and issues
- Provide rapid escalation paths
- Ensure business continuity during peak demand
War Rooms are staffed by VPs, Primary Leads, and Secondary Leads.
They can be Virtual, Onsite (physical location), or Combined.
War Room schedules follow the Welcome Season peak period (January 1-20).''',
'type': 'general_info',
},
{
'id': 'static_find_pocs',
'title': 'How to find POCs',
'content': '''How to find Points of Contact (POCs)
To find the right contact for Welcome Season 2026:
1. Check the WS2026 POCs for team-specific contacts
2. Look up Area/Team name to find WS Point of Contact
3. POC Managers are listed for escalation
4. War Room contacts (VP, Primary Lead, Secondary Lead) handle operational issues
5. Application contacts have product and engineering contacts
Common search terms: POC, contact, manager, lead, who to call, escalation''',
'type': 'help',
},
{
'id': 'static_key_dates',
'title': 'Welcome Season 2026 Key Dates Calendar',
'content': '''Welcome Season 2026 Key Dates and Activity Levels
ACTIVITY LEVEL DEFINITIONS:
- PEAK: Highest volume days. All War Rooms fully staffed. Maximum readiness required. Expect highest call volumes and system load.
- HIGH: Elevated volume days. War Rooms active with enhanced staffing. Above-normal call volumes expected.
- BAU (Business As Usual): Normal operational levels. Standard staffing in effect. Regular monitoring continues.
COMPLETE DATE-BY-DATE SCHEDULE:
Week 1 (January 1-4, 2026):
- Thursday, January 1, 2026 (New Year's Day): PEAK
* Welcome Season officially begins
* New health insurance benefits take effect
* Highest initial call volume as members discover coverage changes
* All War Rooms activated
- Friday, January 2, 2026: PEAK
* Continued high volume from benefit activation
* First business day for many employers
* Heavy prescription transfer activity
- Saturday, January 3, 2026: BAU
* Weekend reduced staffing
* Lower call volumes typical
- Sunday, January 4, 2026: BAU
* Weekend operations continue
* Preparation for Monday peak
Week 2 (January 5-11, 2026):
- Monday, January 5, 2026: PEAK
* First full business week begins
* Major spike in employer-related inquiries
* System load at maximum
- Tuesday, January 6, 2026: HIGH
* Sustained elevated volumes
* Follow-up calls from Monday issues
- Wednesday, January 7, 2026: HIGH
* Mid-week elevated activity
* Processing backlog from peak days
- Thursday, January 8, 2026: PEAK
* Another peak day as issues accumulate
* Escalation volumes increase
- Friday, January 9, 2026: PEAK
* End of first full week