Resources
The following resources are available for use:
cpg_flow.resources.gcp_machine_name
gcp_machine_name(name, ncpu)
Machine type name in the GCP world
Source code in src/cpg_flow/resources.py
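As an illustration only: GCP predefined machine types follow a `<family>-<type>-<vCPUs>` pattern, for example `n1-standard-16`. The sketch below assumes the N1 family, which this page does not confirm, and `sketch_gcp_machine_name` is a hypothetical stand-in rather than the library function.

```python
def sketch_gcp_machine_name(name: str, ncpu: int, family: str = "n1") -> str:
    """Hypothetical stand-in: build a GCP-style machine type name."""
    # `family` is an assumption; this page does not say which series is used.
    return f"{family}-{name}-{ncpu}"


print(sketch_gcp_machine_name("standard", 16))  # n1-standard-16
```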
cpg_flow.resources.MachineType
dataclass
MachineType(
name,
ncpu,
mem_gb_per_core,
price_per_hour,
disk_size_gb,
)
Hail Batch machine type on GCP
max_threads
max_threads()
Number of available threads
calc_instance_disk_gb
calc_instance_disk_gb()
The maximum available storage on an instance is calculated
in batch/batch/utils.py/unreserved_worker_data_disk_size_gib()
as the disk size (375G) minus reserved image size (30G) minus
reserved storage per core (5G*ncpu = 160G for a 32-core instance).
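The arithmetic described above can be sketched as follows. The constants (375G disk, 30G image, 5G per core) come from the description; the function name and its parameters are hypothetical, and this is not the library's implementation.

```python
def sketch_instance_disk_gb(
    disk_size_gb: int,
    ncpu: int,
    reserved_image_gb: int = 30,
    reserved_gb_per_core: int = 5,
) -> int:
    """Hypothetical stand-in: disk size minus image and per-core reservations."""
    return disk_size_gb - reserved_image_gb - reserved_gb_per_core * ncpu


# A 16-core machine with a 375G disk: 375 - 30 - 5 * 16
print(sketch_instance_disk_gb(375, 16))  # 265
```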
set_resources
set_resources(
*,
j,
fraction=None,
ncpu=None,
nthreads=None,
mem_gb=None,
storage_gb=None
)
Set resources on a Job object. Any optional parameters that are set are used as bounds when requesting a fraction of an instance.
request_resources
request_resources(
fraction=None,
ncpu=None,
nthreads=None,
mem_gb=None,
storage_gb=None,
)
Request resources from the machine, satisfying all provided requirements. If no requirements are provided, the minimum number of cores (self.MIN_NCPU) will be used.
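One way to read "satisfying all provided requirements" is to translate each requirement into a core count and take the largest. The sketch below assumes exactly that; `sketch_request_ncpu` is hypothetical, and the default minimum of 2 cores is an assumption because the value of MIN_NCPU is not given on this page.

```python
from typing import List, Optional


def sketch_request_ncpu(
    ncpu_requirements: List[Optional[int]], min_ncpu: int = 2
) -> int:
    """Hypothetical stand-in: pick a core count that satisfies every requirement.

    Each entry is the core count implied by one requirement (fraction, memory,
    storage, threads, ...), or None if that requirement was not provided.
    """
    provided = [n for n in ncpu_requirements if n is not None]
    # The largest requirement wins; with nothing provided, fall back to min_ncpu.
    return max([min_ncpu, *provided])


print(sketch_request_ncpu([None, 4, 8]))  # 8
print(sketch_request_ncpu([None, None]))  # 2
```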
fraction_to_ncpu
fraction_to_ncpu(fraction)
Converts a fraction of the machine to a number of CPUs (e.g. fraction=1.0 to take the entire machine, fraction=0.5 to take half of it, etc.).
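A minimal sketch of the conversion, assuming a 16-core machine as in the STANDARD definition on this page. The function name is hypothetical, and any further rounding to an allowed core count is left out.

```python
import math


def sketch_fraction_to_ncpu(fraction: float, max_ncpu: int = 16) -> int:
    """Hypothetical stand-in: cores needed to cover a fraction of the machine."""
    return math.ceil(max_ncpu * fraction)


print(sketch_fraction_to_ncpu(1.0))  # 16
print(sketch_fraction_to_ncpu(0.5))  # 8
```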
mem_gb_to_ncpu
mem_gb_to_ncpu(mem_gb)
Converts a memory requirement into a CPU count requirement.
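Since each machine type has a fixed amount of memory per core, a memory request can be turned into a core count by dividing and rounding up. The sketch below assumes that approach; the function name is hypothetical and the per-core figures are the STANDARD and HIGHMEM values listed on this page.

```python
import math


def sketch_mem_gb_to_ncpu(mem_gb: float, mem_gb_per_core: float) -> int:
    """Hypothetical stand-in: cores needed to obtain `mem_gb` of memory."""
    return math.ceil(mem_gb / mem_gb_per_core)


print(sketch_mem_gb_to_ncpu(30, 3.75))  # 8 (standard: 3.75 GB per core)
print(sketch_mem_gb_to_ncpu(30, 6.5))   # 5 (highmem: 6.5 GB per core)
```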
storage_gb_to_ncpu
storage_gb_to_ncpu(storage_gb)
Converts a storage requirement into a CPU count requirement.
We want to avoid attaching disks: attaching a disk to an existing instance
might fail with a mkfs.ext4 ... error, see:
https://batch.hail.populationgenomics.org.au/batches/7488/jobs/12
So this function calculates the number of CPUs to request so that your jobs
can be packed to fit the default instance's available storage
(calculated with self.calc_instance_disk_gb()).
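The packing idea can be sketched as: express the storage request as a fraction of the instance's available disk, cap it at the whole machine, and scale by the core count. This is an assumed reading, not the library's code; the function name is hypothetical, and the 265 GB default is the available storage of a 16-core machine under the reservation figures quoted for calc_instance_disk_gb.

```python
import math


def sketch_storage_gb_to_ncpu(
    storage_gb: float, instance_disk_gb: float = 265, max_ncpu: int = 16
) -> int:
    """Hypothetical stand-in: cores to request so the job gets enough local disk."""
    # Share of the instance's available disk that the job needs, at most 100%.
    fraction = min(storage_gb / instance_disk_gb, 1.0)
    return math.ceil(max_ncpu * fraction)


print(sketch_storage_gb_to_ncpu(100))  # 7
print(sketch_storage_gb_to_ncpu(500))  # 16
```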
nthreads_to_ncpu
nthreads_to_ncpu(nthreads)
Convert number of threads into number of cores/CPU
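A minimal sketch, assuming two hardware threads per core (hyper-threading). That ratio is an assumption and is not stated on this page; the function name is hypothetical.

```python
import math


def sketch_nthreads_to_ncpu(nthreads: int, threads_per_core: int = 2) -> int:
    """Hypothetical stand-in: cores needed to run `nthreads` threads."""
    # threads_per_core=2 is an assumption about the underlying machines.
    return math.ceil(nthreads / threads_per_core)


print(sketch_nthreads_to_ncpu(4))  # 2
print(sketch_nthreads_to_ncpu(5))  # 3
```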
adjust_ncpu
adjust_ncpu(ncpu)
Adjusts the requested number of CPUs to a number allowed by Hail, i.e. the nearest power of 2 that is not less than the minimum number of cores allowed.
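The adjustment can be sketched as below, under the assumption that "nearest" means rounding up to the next power of 2. The function name, the minimum of 2 cores, and the error raised for oversized requests are all hypothetical; the 16-core maximum matches the machine definitions on this page.

```python
def sketch_adjust_ncpu(ncpu: int, min_ncpu: int = 2, max_ncpu: int = 16) -> int:
    """Hypothetical stand-in: round a core request up to an allowed power of 2."""
    if ncpu > max_ncpu:
        raise ValueError(f"requested {ncpu} cores, but the machine has {max_ncpu}")
    ncpu = max(ncpu, min_ncpu)
    # Smallest power of 2 that is >= ncpu (3 -> 4, 4 -> 4, 5 -> 8).
    return 1 << (ncpu - 1).bit_length()


print([sketch_adjust_ncpu(n) for n in (1, 3, 4, 5, 16)])  # [2, 4, 4, 8, 16]
```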
cpg_flow.resources.STANDARD
module-attribute
STANDARD = MachineType(
"standard",
ncpu=16,
mem_gb_per_core=3.75,
price_per_hour=1.0787,
disk_size_gb=375,
)
cpg_flow.resources.HIGHMEM
module-attribute
HIGHMEM = MachineType(
"highmem",
ncpu=16,
mem_gb_per_core=6.5,
price_per_hour=1.3431,
disk_size_gb=375,
)
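To make the two definitions easier to compare, the total memory of each machine follows from ncpu multiplied by mem_gb_per_core. The `MachineSpec` class below is a hypothetical mirror of the fields shown above, not the library's MachineType, and `total_mem_gb` is an illustrative helper.

```python
from dataclasses import dataclass


@dataclass
class MachineSpec:
    """Hypothetical mirror of the MachineType fields listed on this page."""

    name: str
    ncpu: int
    mem_gb_per_core: float
    price_per_hour: float
    disk_size_gb: int

    def total_mem_gb(self) -> float:
        # Whole-machine memory: cores times memory per core.
        return self.ncpu * self.mem_gb_per_core


standard = MachineSpec("standard", 16, 3.75, 1.0787, 375)
highmem = MachineSpec("highmem", 16, 6.5, 1.3431, 375)

print(standard.total_mem_gb())  # 60.0
print(highmem.total_mem_gb())   # 104.0
```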
cpg_flow.resources.JobResource
dataclass
JobResource(
machine_type, ncpu=None, attach_disk_storage_gb=None
)
Represents a fraction of a Hail Batch instance.
@param machine_type: Hail Batch machine pool type
@param ncpu: number of CPUs to request. Used to calculate the fraction of the machine to take. If not set, all of the machine's CPUs will be used.
@param attach_disk_storage_gb: if set to a value greater than MachineType.max_default_storage_gb, a larger disk will be attached by Hail Batch.
get_mem_gb
get_mem_gb()
Memory resources in GB
java_mem_options
java_mem_options(overhead_gb=1)
Returns -Xms and -Xmx options that set the JVM memory usage to use all the memory resources represented.
@param overhead_gb: Amount of memory (in decimal GB) to leave available for other purposes.
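As a rough sketch of what such an option string might look like: subtract the overhead from the available memory, round down to whole gigabytes, and use the same value for the initial and maximum heap. The function name, the rounding, and the `g` unit suffix are assumptions rather than the library's confirmed behaviour.

```python
import math


def sketch_java_mem_options(mem_gb: float, overhead_gb: float = 1) -> str:
    """Hypothetical stand-in: JVM heap options leaving `overhead_gb` unused."""
    heap_gb = math.floor(mem_gb - overhead_gb)
    # Same value for initial (-Xms) and maximum (-Xmx) heap size.
    return f"-Xms{heap_gb}g -Xmx{heap_gb}g"


print(sketch_java_mem_options(60))  # -Xms59g -Xmx59g
```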
java_gc_thread_options
java_gc_thread_options(surplus=2)
Returns -XX options that configure JVM garbage collection threading.
@param surplus: Number of threads to leave available for other purposes.
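A possible shape for the returned string, shown as a sketch only. `-XX:+UseParallelGC` and `-XX:ParallelGCThreads` are standard HotSpot flags, but whether the library emits exactly these is an assumption, as are the function name and the floor of one GC thread.

```python
def sketch_java_gc_thread_options(nthreads: int, surplus: int = 2) -> str:
    """Hypothetical stand-in: GC threading options leaving `surplus` threads free."""
    # Keep at least one GC thread even when few threads are available.
    gc_threads = max(1, nthreads - surplus)
    return f"-XX:+UseParallelGC -XX:ParallelGCThreads={gc_threads}"


print(sketch_java_gc_thread_options(8))
# -XX:+UseParallelGC -XX:ParallelGCThreads=6
```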
get_ncpu
get_ncpu()
Number of cores/CPU
get_nthreads
get_nthreads()
Number of threads
get_storage_gb
get_storage_gb()
Calculate storage in GB
set_to_job
set_to_job(j)
Set the resources to a Job object. Return self to allow chaining, e.g.:
nthreads = STANDARD.request_resources(nthreads=4).set_to_job(j).get_nthreads()
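The chaining pattern can be illustrated with stand-in classes. `FakeJob` and `SketchResource` are hypothetical and only mimic the shape of the real objects; the figure of two threads per core is an assumption.

```python
class FakeJob:
    """Hypothetical stand-in for a job object; it only records the CPU request."""

    def __init__(self) -> None:
        self.cpu_request = None

    def cpu(self, cores: int) -> "FakeJob":
        self.cpu_request = cores
        return self


class SketchResource:
    """Hypothetical stand-in for a resource request."""

    def __init__(self, ncpu: int, threads_per_core: int = 2) -> None:
        self.ncpu = ncpu
        self.threads_per_core = threads_per_core  # assumption, not from this page

    def get_nthreads(self) -> int:
        return self.ncpu * self.threads_per_core

    def set_to_job(self, j: FakeJob) -> "SketchResource":
        j.cpu(self.ncpu)
        return self  # returning self is what makes the chained call possible


job = FakeJob()
nthreads = SketchResource(ncpu=2).set_to_job(job).get_nthreads()
print(nthreads, job.cpu_request)  # 4 2
```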
cpg_flow.resources.storage_for_cram_qc_job
storage_for_cram_qc_job()
Get the storage request, in GB, for a CRAM QC processing job.
cpg_flow.resources.joint_calling_scatter_count
joint_calling_scatter_count(sequencing_group_count)
Number of partitions for joint-calling jobs (GenotypeGVCFs, VQSR, VEP), as a function of the number of sequencing groups.
cpg_flow.resources.storage_for_joint_vcf
storage_for_joint_vcf(
sequencing_group_count, site_only=True
)
Storage sufficient to fit and process a joint-called VCF.