This document describes the architecture of src/core/parsing/: every class, dependency, protocol relationship, and singleton, along with the complete scanning and parsing data flows.
Scanning Data Flow
sequenceDiagram
autonumber
participant Caller as Caller<br>(ApplicationAPI)
participant API as Gem5ParserAPI
participant Scanner as Gem5Scanner
participant Pool as ScanWorkPool
participant Work as Gem5ScanWork
participant PerlScan as Gem5StatsScanner<br>(Singleton)
participant Perl as statsScanner.pl
participant TM as TypeMapper
participant PA as PatternAggregator
Note over Caller, PA: ━━━ SCANNING FLOW ━━━
Caller->>API: submit_scan_async(path, pattern, limit)
API->>Scanner: submit_scan_async()
Scanner->>Scanner: normalize_user_path(stats_path)
Scanner->>Scanner: rglob(pattern) → files[:limit]
loop For each stats file
Scanner->>Work: Gem5ScanWork(file_path)
end
Scanner->>Pool: submit_batch_async(work_items)
Pool-->>Caller: List[Future[List[ScannedVariable]]]
Note over Work,Perl: Parallel execution in ProcessPool
par Worker Process 1
Work->>PerlScan: get_instance().scan_file(path)
PerlScan->>Perl: subprocess.run(statsScanner.pl, file)
Perl-->>PerlScan: JSON output
PerlScan->>TM: map_scan_result(entry)
TM-->>PerlScan: normalized dict
PerlScan-->>Work: List[ScannedVariable]
and Worker Process N
Work->>PerlScan: get_instance().scan_file(path)
PerlScan->>Perl: subprocess.run(statsScanner.pl, file)
Perl-->>PerlScan: JSON output
PerlScan-->>Work: List[ScannedVariable]
end
Caller->>Caller: results = [f.result() for f in futures]
Caller->>API: aggregate_scan_results(results)
API->>Scanner: aggregate_scan_results()
Scanner->>Scanner: _merge_variable() for each var<br>Union entries, expand min/max
Scanner->>PA: aggregate_patterns(merged_vars)
Note over PA: cpu0, cpu1...cpu15<br>→ cpu\d+ [vector]
PA->>PA: _extract_pattern() per var name
PA->>PA: Group by pattern signature
PA->>PA: _create_pattern_variable()
PA-->>Scanner: List[ScannedVariable] (aggregated)
Scanner-->>Caller: Final scanned variables
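The pattern aggregation step at the end of the scanning flow (cpu0, cpu1...cpu15 collapsing into cpu\d+) can be illustrated with a minimal sketch. It is not the actual PatternAggregator code. The `ScannedVariable` dataclass below is a hypothetical stand-in with assumed fields, the function names only mirror the private methods in the diagram, and storing the original member names in `entries` is an assumption about how the collapsed variable is represented.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ScannedVariable:
    # Hypothetical stand-in: the real ScannedVariable fields are not shown here.
    name: str
    type: str
    entries: List[str] = field(default_factory=list)


_DIGITS = re.compile(r"\d+")


def extract_pattern(name: str) -> str:
    # Replace each run of digits with the literal regex token \d+.
    # A callable replacement avoids backslash-escape processing in re.sub.
    return _DIGITS.sub(lambda _match: r"\d+", name)


def aggregate_patterns(variables: List[ScannedVariable]) -> List[ScannedVariable]:
    # Group by (pattern, type) so differently typed stats are never merged.
    groups: Dict[Tuple[str, str], List[ScannedVariable]] = defaultdict(list)
    for var in variables:
        groups[(extract_pattern(var.name), var.type)].append(var)

    aggregated: List[ScannedVariable] = []
    for (pattern, _var_type), members in groups.items():
        if len(members) == 1:
            # A single match has nothing to collapse; keep the variable as-is.
            aggregated.append(members[0])
        else:
            # Assumption: the collapsed variable is reported as a vector whose
            # entries are the original member names.
            aggregated.append(
                ScannedVariable(
                    name=pattern,
                    type="vector",
                    entries=[m.name for m in members],
                )
            )
    return aggregated
```

Under these assumptions, sixteen variables named system.cpu0.ipc through system.cpu15.ipc collapse into one vector named system.cpu\d+.ipc, while a variable with no numbered siblings passes through unchanged. The sketch leaves dots in the name unescaped, so the result is a loose pattern rather than a strict regex.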
Parsing Data Flow
sequenceDiagram
autonumber
participant Caller as Caller<br>(ApplicationAPI)
participant API as Gem5ParserAPI
participant Parser as Gem5Parser
participant SF as StrategyFactory
participant Strat as SimpleStatsStrategy
participant TM as TypeMapper
participant Pool as ParseWorkPool
participant Work as Gem5ParseWork
participant PP as PerlWorkerPool<br>(Singleton)
participant Perl as fileParserServer.pl
participant CSV as construct_final_csv
Note over Caller, CSV: ━━━ PARSING FLOW ━━━
Caller->>API: submit_parse_async(path, pattern, vars, out_dir, scanned_vars)
API->>Parser: submit_parse_async()
Note over Parser: Step 1: Regex Expansion
Parser->>Parser: For each var with regex chars:<br>Match against scanned_vars<br>Expand pattern_indices → leaf vars<br>Inject parsed_ids into params
Note over Parser: Step 2: Strategy Resolution
Parser->>SF: create(strategy_type)
SF-->>Parser: SimpleStatsStrategy | ConfigAwareStrategy
Note over Strat: Step 3: Work Item Generation
Parser->>Strat: get_work_items(path, pattern, configs)
Strat->>Strat: _get_files() → glob → List[str]
Strat->>TM: create_stat(config) for each var
TM->>TM: StatTypeRegistry.create(type, **kwargs)
TM-->>Strat: Dict[str, StatType]
Strat-->>Parser: List[Gem5ParseWork]
Note over Pool,Work: Step 4: Parallel Submission
Parser->>Pool: submit_batch_async(work_items)
Pool-->>Caller: List[Future[ParsedVarsDict]]
par Worker Process 1
Work->>PP: get_worker_pool()
Work->>PP: parse_file(path, var_keys)
PP->>Perl: stdin: PARSE|path|var1,var2,...
Perl-->>PP: stdout: Type/VarID/Value lines
PP-->>Work: List[str] output
Work->>Work: _processOutput()
Note over Work: Line format: Type/VarID/Value<br>Entry types → buffer<br>Scalar → content = value<br>Summary → aggregate
Work->>Work: _applyBufferedEntries()
Work->>Work: _validateVars()
Work-->>Pool: ParsedVarsDict
and Worker Process N
Work->>PP: parse_file(...)
PP->>Perl: stdin: PARSE|...|...
Perl-->>PP: output lines
Work-->>Pool: ParsedVarsDict
end
Caller->>Caller: results = [f.result() for f in futures]
Note over Caller,CSV: Step 5: Finalization
Caller->>API: finalize_parsing(out_dir, results, strategy)
API->>Parser: finalize_parsing()
Parser->>Strat: post_process(results)
Note over Strat: ConfigAware: augment with config.ini<br>Simple: passthrough
Parser->>CSV: construct_final_csv(out_dir, results)
Note over CSV: For each var in results:<br> balance_content() → pad to repeat<br> reduce_duplicates() → arithmetic mean<br> reduced_content → CSV values<br><br>Entry types → var..entry1, var..entry2<br>Scalar types → var
CSV-->>Caller: path/to/results.csv
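The worker-side handling of the Type/VarID/Value lines can be sketched as follows. This is an illustration of the line format in the diagram, not the real `_processOutput()`. Several details are assumptions: the type tags in `ENTRY_TYPES`, the use of `::` to separate a variable name from its entry key (borrowed from the subname convention in gem5 stats files), and numeric values. Summary handling is left out because the diagram does not say how summaries are aggregated.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed type tags for stats that carry named entries.
ENTRY_TYPES = {"vector", "distribution", "histogram"}

Scalars = Dict[str, List[float]]
Entries = Dict[str, Dict[str, List[float]]]


def process_output(lines: Iterable[str]) -> Tuple[Scalars, Entries]:
    scalars: Scalars = defaultdict(list)
    buffered: Entries = defaultdict(lambda: defaultdict(list))

    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines

        # Split into at most three fields: Type / VarID / Value.
        parts = line.split("/", 2)
        if len(parts) != 3:
            raise ValueError(f"malformed line: {raw!r}")
        kind, var_id, value = parts
        kind = kind.lower()

        if kind == "scalar":
            # Scalar values become the variable's content directly.
            scalars[var_id].append(float(value))
        elif kind in ENTRY_TYPES:
            # Entry values are buffered per (variable, entry) and applied later.
            name, sep, entry = var_id.partition("::")
            if not sep:
                raise ValueError(f"entry line without entry key: {raw!r}")
            buffered[name][entry].append(float(value))
        else:
            raise ValueError(f"unsupported type tag: {kind!r}")

    # Convert the defaultdicts into plain dicts before returning.
    return dict(scalars), {name: dict(e) for name, e in buffered.items()}
```

Buffering entry values instead of writing them straight into a variable mirrors the two-phase shape in the diagram, where `_applyBufferedEntries()` runs only after every line has been read.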
Class Hierarchy
classDiagram
direction TB
class ParserProtocol {
<<Protocol>>
+submit_parse_async()$ List~Future~
+finalize_parsing()$ Optional~str~
+construct_final_csv()$ Optional~str~
}
class ScannerProtocol {
<<Protocol>>
+submit_scan_async()$ List~Future~
+aggregate_scan_results()$ List~ScannedVariable~
}
class ParserAPI {
<<Protocol>>
+submit_parse_async() List~Future~
+finalize_parsing() Optional~str~
+submit_scan_async() List~Future~
+aggregate_scan_results() List~ScannedVariable~
}
class Gem5ParserAPI {
+submit_parse_async()
+finalize_parsing()
+submit_scan_async()
+aggregate_scan_results()
}
class Gem5Parser {
+submit_parse_async()$
+finalize_parsing()$
+construct_final_csv()$
}
class Gem5Scanner {
+submit_scan_async()$
+aggregate_scan_results()$
-_merge_variable()$
}
ParserAPI --|> ParserProtocol
ParserAPI --|> ScannerProtocol
Gem5ParserAPI ..|> ParserAPI : implements
Gem5Parser ..|> ParserProtocol : implements
Gem5Scanner ..|> ScannerProtocol : implements
Gem5ParserAPI --> Gem5Parser : delegates
Gem5ParserAPI --> Gem5Scanner : delegates
class Job {
<<ABC>>
+__call__()* Any
}
class ParseWork {
+__call__() ParsedVarsDict
}
class ScanWork {
+__call__() Any
}
class Gem5ParseWork {
-fileToParse: str
-varsToParse: Dict
+__call__() ParsedVarsDict
-_runPerlScript() str
-_processOutput()
}
class Gem5ScanWork {
-file_path: str
+__call__() List~ScannedVariable~
}
ParseWork --|> Job
ScanWork --|> Job
Gem5ParseWork --|> ParseWork
Gem5ScanWork --|> ScanWork
class FileParserStrategy {
<<Protocol>>
+execute()
+get_work_items() Sequence~ParseWork~
+post_process()
}
class SimpleStatsStrategy {
+execute()
+get_work_items()
+post_process()
-_get_files()
-_map_variables()
}
class ConfigAwareStrategy {
+post_process()
-_parse_config()
}
SimpleStatsStrategy ..|> FileParserStrategy : implements
ConfigAwareStrategy --|> SimpleStatsStrategy
class StatType {
<<base>>
#_content: Any
#_repeat: int
+content
+reduced_content
+balance_content()
+reduce_duplicates()
+entries
}
class Scalar
class Vector {
#_entries: List~str~
}
class Distribution {
+minimum: float
+maximum: float
+statistics: bool
}
class Histogram {
+bins: int
#_entries: List~str~
}
class Configuration {
+onEmpty: str
}
Scalar --|> StatType
Vector --|> StatType
Distribution --|> StatType
Histogram --|> StatType
Configuration --|> StatType
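To make the StatType contract above more concrete, here is a minimal sketch of how `balance_content()`, `reduce_duplicates()`, and `reduced_content` might fit together for a scalar. It assumes that content is a list of floats, that balancing pads with 0.0 up to `_repeat` values, and that reduction is the arithmetic mean of the padded list. The `add()` helper is hypothetical and exists only to feed the example. The real classes may differ on all of these points.

```python
from statistics import fmean
from typing import List, Optional


class StatType:
    def __init__(self, repeat: int = 1) -> None:
        if repeat < 1:
            raise ValueError("repeat must be at least 1")
        self._repeat = repeat
        self._content: List[float] = []
        self._reduced: Optional[float] = None

    @property
    def content(self) -> List[float]:
        return self._content

    def add(self, value: float) -> None:
        # Hypothetical helper, not part of the documented interface.
        self._content.append(float(value))

    def balance_content(self) -> None:
        # Pad with zeros until there is one value per expected repetition.
        # Zero-padding is an assumption; it lowers the mean for missing runs.
        missing = self._repeat - len(self._content)
        if missing > 0:
            self._content.extend([0.0] * missing)

    def reduce_duplicates(self) -> None:
        # Collapse the repeated measurements into their arithmetic mean.
        if not self._content:
            raise ValueError("no content to reduce")
        self._reduced = fmean(self._content)

    @property
    def reduced_content(self) -> float:
        if self._reduced is None:
            raise RuntimeError("call reduce_duplicates() first")
        return self._reduced


class Scalar(StatType):
    """Single-valued statistic; inherits all behaviour in this sketch."""
```

With `repeat=4` and two recorded values 2.0 and 4.0, balancing yields four values and the reduced content is their mean, 1.5. Entry-based types such as Vector or Histogram would presumably apply the same steps per entry.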