SPS

The Semi-Permanent Storage (SPS) service is a distributed filesystem accessible from all the computing platform servers (interactive servers included). The allocated space is shared by all users of a group and is reachable through the path /sps/mygroup.

The default quota for a newly created group is 5 TiB. When requesting additional space, please make sure you follow the allocation criteria.

To check your group's SPS usage without the monitoring interface, you may run the Unix command df:

% df -h /sps/mygroup

or, for more detailed information (number of files, space used by a specific user…), run the command spsquotalist. Running the command without arguments lists some syntax examples:

% spsquotalist

spsquotalist usage examples:
...
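
For scripted checks, the df output can be parsed rather than read by eye. The sketch below is only an illustration: the function names are hypothetical helpers, not part of the SPS tooling, and it assumes a df that supports the POSIX -P option.

```shell
# Hypothetical helpers (not SPS tools): extract the use percentage from df.

# Read "df -P" output on stdin and print the use percentage as a bare number.
usage_percent_from_df() {
    # Line 1 is the header; on line 2 the fifth field is the capacity, e.g. "42%".
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Print the use percentage of the filesystem that holds the given path.
usage_percent() {
    # -P selects the single-line POSIX output format, which is stable to parse.
    df -P "$1" | usage_percent_from_df
}
```

Calling usage_percent /sps/mygroup would then print a bare number that a script can compare against a threshold of its choice.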

Best practices

The SPS service is disk-based storage intended for (but not limited to) files that are:

  • too large for your home directory;

  • modified (very) often, that is, files for which the I/O pattern is mostly (small) writes, for instance log files;

  • used by many jobs, especially when the output of one job is the input of a job running later;

  • or very numerous (from tens of thousands to millions or even tens of millions of files).

Semi-permanent storage is not intended for permanent storage (for long-term storage, see section Mass storage). SPS disk servers use redundant and reliable storage, but they can still fail badly, and your data could then be permanently lost since, with very few exceptions, the /sps disk spaces are not backed up.

The data on SPS is accessed through POSIX calls in the same way as on any local filesystem, using standard API functions and commands such as ls, cp, etc.

Note

To submit a job that needs to access SPS, specify it at submission time with the option -L sps.

SPS monitoring

Daily statistics about the usage of the SPS disk storage are provided by the service managers. Four kinds of statistics are available for every /sps disk space:

  • Per-user space used and file statistics. The intent is to let users know who in their group is using the disk space available to the group. This is only a per-user summary; file lists are not available. All files are taken into account, including those in directories with restrictive access permissions.

  • Space used and file statistics per top-level object (file or directory). The intent is to let users know the amount of space used by the major parts of the filesystem. This is roughly similar to the output of the du command, used the following way:

    % cd /sps/mygroup && du -sh *
    

    For some groups with specific settings, the output may also include objects at a deeper level than the top-level objects (in such a case, these appear in italic and right justified in the first column of the table).

  • Per-user cleaned-up space statistics. This provides a quick view of the oldest files in a group, suggesting cleanup targets.

  • Graphs about file statistics. This includes information about:

    • File size distribution, for instance: “how many files sized between 5 KiB and 10 KiB are there? or how much space is used by all files with a size between 10 and 50 MiB?”

    • Last access time, for instance: “how many files have been accessed in the last 2 weeks? or how much space is used by files not accessed in the last 6 months?”. Accessing a file means reading at least one byte of it.

    • File freshness: a rough measure of the frequency and type of access to files and of their life expectancy. It is the time elapsed between the last modification of a file (last write) and the last access to that file (last read).

    • File types: number of files and space used by file type (for instance ROOT files, log files, etc.). The type of a file is guessed from the filename suffix (extension); for example, myjob.log is likely to be a log file.
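
The last access and freshness figures described above can be approximated by hand on a single directory. The sketch below is an illustration only: both function names are hypothetical, it assumes GNU find and GNU stat, and the monitoring interface may compute its own statistics differently.

```shell
# Hypothetical helpers (not SPS tools). Assumes GNU find and GNU stat.

# Count the regular files under a directory that have not been read for more
# than N days, and add up their sizes.
# $1: directory, $2: age in days
stale_summary() {
    # -atime +N matches files whose age since last access, counted in whole
    # days, is greater than N; -printf '%s\n' prints each file size in bytes.
    find "$1" -type f -atime +"$2" -printf '%s\n' |
        awk '{ n++; bytes += $1 } END { printf "%d files, %d bytes\n", n, bytes }'
}

# Print the freshness of one file in seconds:
# last access time minus last modification time.
file_freshness_seconds() {
    # %X is the last access time, %Y the last modification time (epoch seconds).
    stat -c '%X %Y' -- "$1" | awk '{ print $1 - $2 }'
}
```

For example, stale_summary /sps/mygroup/mydir 180 (where mydir is a placeholder directory name) would summarise the files not read in roughly the last 6 months. Scanning a very large tree this way can take a long time and generates metadata load, so it is better suited to a subdirectory than to the whole group space.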

For any issue with this interface, please first browse the dedicated FAQ.

You can find this information grouped on the SPS Monitoring homepage.