pleiades.utils.files module
File utilities for PLEIADES neutron imaging data processing.
This module provides utilities for file discovery, metadata extraction, and data export operations. It includes functions for finding image files with dominant extensions, extracting timing information from filenames, and exporting processed data to ASCII format.
The module supports:
- Automatic file discovery with extension filtering
- Filename-based metadata extraction for neutron imaging files
- ASCII data export with proper formatting
- Robust error handling for file operations
Example
Basic file discovery and export:
>>> files, ext = retrieve_list_of_most_dominant_extension_from_folder("/path/to/data")
>>> print(f"Found {len(files)} {ext} files")
>>>
>>> data_dict = {"energy": [1, 2, 3], "transmission": [0.8, 0.6, 0.4]}
>>> export_ascii(data_dict, "output.txt")
- pleiades.utils.files.retrieve_list_of_most_dominant_extension_from_folder(folder: str = '', files: List[str] = None) → Tuple[List[str], str]
Find and return files with the most common extension from a folder or file list.
Analyzes a folder or list of files to determine the most frequently occurring file extension, then returns all files with that extension. This is useful for automatically detecting the primary data format in imaging directories.
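- Parameters:
folder (str) – Path to the folder to search. Defaults to ''.
files (List[str]) – List of file paths to analyze when no folder is given. Defaults to None.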
- Returns:
A tuple containing:
- List of absolute file paths with the dominant extension, sorted alphabetically
- The dominant file extension (e.g., ‘.tiff’, ‘.fits’)
- Return type:
Tuple[List[str], str]
Example
From folder:
>>> files, ext = retrieve_list_of_most_dominant_extension_from_folder("/path/to/data")
>>> print(f"Found {len(files)} files with extension {ext}")
Found 100 files with extension .tiff
From file list:
>>> file_list = ["/path/file1.tiff", "/path/file2.tiff", "/path/file3.fits"]
>>> files, ext = retrieve_list_of_most_dominant_extension_from_folder(files=file_list)
>>> ext
'.tiff'
Note
If folder is provided, it takes precedence over files parameter
Files are returned as absolute paths and sorted alphabetically
Extension counting is case-sensitive
Hidden files (starting with ‘.’) are included in the search
- Raises:
FileNotFoundError – If folder doesn’t exist
ValueError – If no files are found or all files lack extensions
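The selection rule described above can be illustrated with a minimal sketch. This is not the PLEIADES implementation: the helper name dominant_extension is hypothetical, and the handling of ties and of files without an extension is an assumption made for illustration.

```python
import os
from collections import Counter
from typing import List, Tuple


def dominant_extension(paths: List[str]) -> Tuple[List[str], str]:
    """Hypothetical helper: return the paths sharing the most common extension."""
    # Count extensions exactly as written, so ".TIFF" and ".tiff" are distinct
    # (case-sensitive counting, as stated in the Note above).
    counts = Counter(os.path.splitext(p)[1] for p in paths)
    # Assumption: files without any extension do not take part in the vote.
    counts.pop("", None)
    if not counts:
        raise ValueError("no files with an extension were found")
    ext, _ = counts.most_common(1)[0]
    # Return absolute paths, sorted alphabetically.
    matching = sorted(
        os.path.abspath(p) for p in paths if os.path.splitext(p)[1] == ext
    )
    return matching, ext
```

Called with the three-file list from the example above, the sketch returns the two ‘.tiff’ paths together with ‘.tiff’.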
- pleiades.utils.files.retrieve_number_of_frames_from_file_name(file_name: str) → int
Extract the number of time-of-flight frames from a neutron imaging filename.
Parses specially formatted filenames to extract the number of time frames. The expected format includes ‘T’ followed by the frame count, then ‘p’. This is commonly used in neutron imaging file naming conventions.
- Parameters:
file_name (str) – Filename containing frame information in the format ‘…T{frame_count}p…’. Example: ‘image_m2M9997Ex512y512t1e6T2000p1e6P100.tiff’
- Returns:
Number of time-of-flight frames extracted from the filename
- Return type:
int
Example
>>> filename = "image_m2M9997Ex512y512t1e6T2000p1e6P100.tiff"
>>> frames = retrieve_number_of_frames_from_file_name(filename)
>>> frames
2000
>>> filename = "data_T500p.fits"
>>> frames = retrieve_number_of_frames_from_file_name(filename)
>>> frames
500
- Raises:
ValueError – If the filename doesn’t contain required ‘T’ and ‘p’ markers
ValueError – If the extracted value cannot be converted to an integer
Note
The function looks for the pattern ‘T{number}p’ in the filename
Only the basename of the file is considered (path is ignored)
The number must be a valid integer
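As an illustration of the ‘T{number}p’ pattern, the following sketch extracts the frame count with a regular expression. It is only one possible way to implement the documented behavior; the helper name frames_from_name and the use of re.search are assumptions, not the library's code.

```python
import os
import re


def frames_from_name(file_name: str) -> int:
    """Hypothetical helper: read the integer between 'T' and 'p'."""
    # Only the basename is inspected, so directory names cannot match.
    base = os.path.basename(file_name)
    match = re.search(r"T(\d+)p", base)
    if match is None:
        raise ValueError(f"no 'T<number>p' pattern in {base!r}")
    return int(match.group(1))
```

With the filenames from the example above, the sketch yields 2000 and 500 respectively.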
- pleiades.utils.files.retrieve_time_bin_size_from_file_name(file_name: str) → float
Extract the time bin size from a neutron imaging filename.
Parses specially formatted filenames to extract the time bin size used for time-of-flight measurements. The expected format includes ‘t’ followed by the bin size, then ‘T’. Handles scientific notation with automatic correction for common formatting issues.
- Parameters:
file_name (str) – Filename containing time bin information in the format ‘…t{bin_size}T…’. Example: ‘image_m2M9997Ex512y512t1e6T2000p1e6P100.tiff’. Scientific notation such as ‘1e6’ is supported and is corrected to ‘1e-6’.
- Returns:
Time bin size in seconds (typically 1e-6, i.e. one microsecond)
- Return type:
float
Example
>>> filename = "image_m2M9997Ex512y512t1e6T2000p1e6P100.tiff"
>>> bin_size = retrieve_time_bin_size_from_file_name(filename)
>>> bin_size
1e-06
>>> filename = "data_t0.001T500p.fits"
>>> bin_size = retrieve_time_bin_size_from_file_name(filename)
>>> bin_size
0.001
- Raises:
ValueError – If the filename doesn’t contain required ‘t’ and ‘T’ markers
ValueError – If the extracted value cannot be converted to a float
Note
The function looks for the pattern ‘t{number}T’ in the filename
Automatically corrects ‘e’ to ‘e-’ in scientific notation, because filenames commonly omit the minus sign of the exponent
Only the basename of the file is considered (path is ignored)
Supports both decimal and scientific notation
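The ‘e’ to ‘e-’ correction can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the helper name time_bin_from_name is hypothetical, and leaving a value that already contains ‘e-’ untouched is an assumed behavior.

```python
import os
import re


def time_bin_from_name(file_name: str) -> float:
    """Hypothetical helper: read the number between 't' and 'T'."""
    base = os.path.basename(file_name)
    # Digits and dots, optionally followed by an exponent such as e6 or e-6.
    match = re.search(r"t([0-9.]+(?:e-?[0-9]+)?)T", base)
    if match is None:
        raise ValueError(f"no 't<number>T' pattern in {base!r}")
    text = match.group(1)
    # Assumption: an exponent written without a sign stands for a negative one.
    if "e" in text and "e-" not in text:
        text = text.replace("e", "e-")
    # float() raises ValueError for malformed text such as "1.2.3".
    return float(text)
```

With the filenames from the example above, the sketch returns 1e-06 and 0.001.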
- pleiades.utils.files.export_ascii(data_dict: Dict[str, List | Any], file_path: str) → None
Export processed data to a tab-separated ASCII file.
Converts a dictionary of data arrays to a formatted ASCII file suitable for analysis in external tools. The output uses tab separation with column headers for easy import into spreadsheet or analysis software.
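- Parameters:
data_dict (Dict[str, List | Any]) – Dictionary mapping column names to data columns; all columns must have the same length
file_path (str) – Path of the ASCII file to write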
Example
Basic export:
>>> data = {
...     "energy_eV": [1.0, 2.0, 3.0],
...     "transmission": [0.8, 0.6, 0.4],
...     "uncertainties": [0.1, 0.08, 0.06]
... }
>>> export_ascii(data, "transmission_results.txt")
Data exported to transmission_results.txt
Output file format (tab-separated):
energy_eV    transmission    uncertainties
1.0          0.8             0.1
2.0          0.6             0.08
3.0          0.4             0.06
- Raises:
ValueError – If data_dict is empty or contains mismatched array lengths
IOError – If file cannot be written (permissions, disk space, etc.)
KeyError – If data_dict contains invalid data types
Note
Uses tab separation for easy import into analysis software
Includes column headers in the first row
Creates parent directories if they don’t exist
Overwrites existing files without warning
All data columns must have the same length
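The file layout described above (one header row, tab-separated columns, parent directories created on demand) can be sketched as follows. This is an illustrative approximation rather than the actual export_ascii code; the helper name write_tab_separated is hypothetical, and formatting values with str() is an assumption.

```python
import os
from typing import Dict, Sequence


def write_tab_separated(data: Dict[str, Sequence], file_path: str) -> None:
    """Hypothetical helper: write equal-length columns as a tab-separated file."""
    if not data:
        raise ValueError("data is empty")
    # Every column must contain the same number of entries.
    if len({len(column) for column in data.values()}) != 1:
        raise ValueError("all columns must have the same length")
    # Create the parent directory if the path has one and it is missing.
    parent = os.path.dirname(file_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Mode "w" overwrites an existing file without warning.
    with open(file_path, "w") as handle:
        handle.write("\t".join(data.keys()) + "\n")
        # zip(*columns) turns the columns into rows.
        for row in zip(*data.values()):
            handle.write("\t".join(str(value) for value in row) + "\n")
```

Because the dictionary preserves insertion order, the columns appear in the file in the order the keys were defined.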