Maps (operations on sequences)
utilz.maps
The maps module is designed to be used with sequences and has several generalizations of functions in utilz.ops
that work with sequences. Here are the parallels:
map function(s) | op function(s) | description |
---|---|---|
`map` | `do` | apply one function |
`mapcompose` | `pipe` / `do(compose())` | apply multiple functions in sequence |
`mapmany` | `many` | apply multiple functions in parallel |
`mapif` | `iffy` | apply one function if a predicate function is true, otherwise no-op |
`mapacross` | None | apply multiple functions to multiple inputs in pairs |
`mapcat` | None | apply one multi-output function and flatten the results |
`mapwith` | None | map a two-argument function to an iterable and a fixed arg, or to two iterables |
All members of the `map` family expect an iterable as their last argument, each element of which is passed to the functions as their first argument. Except for `mapcat`, all `map*` functions return a sequence the same length as the input they received.
check_random_state(seed=None)
Turn seed into a `np.random.RandomState` instance. Note: credit for this code goes entirely to `sklearn.utils.check_random_state`. Using the source here simply avoids an unnecessary dependency.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`seed` | `None`, `int`, `np.random.RandomState` | If seed is None, return the RandomState singleton used by np.random. If seed is an int, return a new RandomState instance seeded with seed. If seed is already a RandomState instance, return it. Otherwise raise ValueError. | `None` |
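
The dispatch described in the table can be sketched with the standard library's `random.Random` swapped in for `np.random.RandomState`, so the example runs without NumPy. This is an analogy only, not the utilz or scikit-learn code; `check_rng_sketch` and `_GLOBAL_RNG` are hypothetical names introduced for illustration.

```python
import random

# Hypothetical stand-in for the RandomState singleton used by np.random.
_GLOBAL_RNG = random.Random()


def check_rng_sketch(seed=None):
    """Turn seed into a random.Random instance (stdlib analogue of the rules above)."""
    if seed is None:
        # No seed: hand back the shared generator.
        return _GLOBAL_RNG
    if isinstance(seed, int):
        # Integer seed: build a fresh, reproducibly seeded generator.
        return random.Random(seed)
    if isinstance(seed, random.Random):
        # Already a generator: return it untouched.
        return seed
    raise ValueError(f"{seed!r} cannot be used to seed a random generator")


same_a = check_rng_sketch(1).random()
same_b = check_rng_sketch(1).random()
print(same_a == same_b)  # two generators built from the same int seed agree
```

Accepting all three input kinds lets a function take a single `random_state` argument and still work whether the caller passes nothing, a seed, or a generator it already owns.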
filter(how, iterme, invert=False, substr_match=True, assert_notempty=True)
Filter an iterable and collect the output into a list rather than a generator, unlike Python's built-in `filter`. By default, returns the matching elements from `iterme`. This can be inverted using `invert=True`, or split using `invert='split'`, which returns `matches, nomatches`. Filtering can be done by passing a function, a single `str/int/float`, or an iterable of `str/int/float`. Filtering by an iterable checks whether any of the values in the iterable exist in each item of `iterme`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`how` | `Union[Callable, Iterable, str, int, float]` | if a function is passed it | required |
`iterme` | `Iterable` | iterable to filter | required |
`invert` | `bool` or `str`, optional | if `True`, return the non-matching elements; if `'split'`, return `matches, nomatches` | `False` |
`assert_notempty` | `bool` | raise an error if the returned output is empty | `True` |
Returns:
Name | Type | Description |
---|---|---|
| `list` | filtered version of `iterme` |
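
The matching rules above can be approximated in plain Python. This is a rough sketch under stated assumptions, not the utilz implementation: `filter_sketch` is a hypothetical name, non-callable criteria are compared as substrings of `str(item)`, and the empty-output check is skipped for `invert='split'`.

```python
def filter_sketch(how, iterme, invert=False, assert_notempty=True):
    """Return a list of matches, non-matches, or both, depending on invert."""
    if callable(how):
        keep = how
    elif isinstance(how, (str, int, float)):
        # Assumption: a single value matches when it is a substring of the item.
        def keep(item):
            return str(how) in str(item)
    else:
        needles = [str(h) for h in how]

        # An iterable of values matches when ANY value is found in the item.
        def keep(item):
            return any(n in str(item) for n in needles)

    items = list(iterme)
    matches = [x for x in items if keep(x)]
    nomatches = [x for x in items if not keep(x)]

    # Check for 'split' first, because a non-empty string is also truthy.
    if invert == "split":
        return matches, nomatches
    out = nomatches if invert else matches
    if assert_notempty and not out:
        raise ValueError("filtering returned no results")
    return out


files = ["sub1_rest.csv", "sub2_task.csv", "sub3_rest.txt"]
print(filter_sketch("rest", files))             # ['sub1_rest.csv', 'sub3_rest.txt']
print(filter_sketch(["task", ".txt"], files))   # ['sub2_task.csv', 'sub3_rest.txt']
print(filter_sketch("rest", files, invert="split"))
```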
map(func, iterme, **kwargs)
Super-power your `for` loops with a progress bar and optional reproducible parallelization!
Maps `func` to `iterme`. Includes a progress bar powered by `tqdm`. Supports parallelization with `joblib.Parallel` multi-processing by setting `n_jobs > 1`. The progress bar accurately tracks parallel jobs!
`iterme` can be a list of elements, a list of DataFrames, a list of arrays, or a list of lists. Lists of lists up to 2 deep will be flattened to a single list when `func=None`.
See the examples below for interesting use cases beyond standard looping!
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`func` | `callable` | function to map | required |
`iterme` | `iterable` | an iterable for which each element will be passed to func | required |
`enum` | `bool` | whether the value of the current iteration should be passed to | required |
`random_state` | `bool/int` | whether a randomly initialized seed should be | required |
`n_jobs` | `int` | number of cpus/threads; default 1 (no parallelization) | required |
`backend` | `str` | Only applies if | required |
`pbar` | `bool` | whether to use tqdm to show a progress bar; Default | required |
`verbose` | `int` | | required |
`**kwargs` | `dict` | optional keyword arguments to pass to func | `{}` |
Examples:

    >>> # Just like map
    >>> out = map(lambda x: x * 2, [1, 2, 3, 4])

    >>> # Concatenating nested lists
    >>> data = [[1, 2], [3, 4]]
    >>> out = mapcat(None, data)

    >>> # Load multiple files into a single dataframe
    >>> import pandas as pd
    >>> out = mapcat(pd.read_csv, ["file1.txt", "file2.txt", "file3.txt"])

    >>> # Parallelization with randomness
    >>> from time import sleep
    >>> def f_random(x, random_state=None):
    ...     random_state = check_random_state(random_state)
    ...     sleep(0.5)
    ...     # Use the random state's number generator rather than np.random
    ...     return x + random_state.rand()
    ...
    >>> # Now set a random_state in mapcat to reproduce the parallel runs
    >>> # It doesn't pass the value, but rather generates a reproducible list
    >>> # of seeds that are passed to each function execution
    >>> out = mapcat(f_random, [1, 1, 1, 1, 1], n_jobs=2, random_state=1)
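
The last example relies on one `random_state` being expanded into a reproducible list of per-call seeds. A minimal sketch of that idea follows, using only the standard library. How utilz actually derives its seeds is not shown in this page, so `make_seeds`, `noisy_add`, and `run_with_seeds` are hypothetical helpers.

```python
import random


def make_seeds(random_state, n):
    """Derive n per-call seeds from one master seed (same input, same list)."""
    master = random.Random(random_state)
    return [master.randrange(2**32) for _ in range(n)]


def noisy_add(x, seed):
    """Each call builds its own generator from the seed it was handed."""
    return x + random.Random(seed).random()


def run_with_seeds(inputs, random_state):
    """Pair every input with its own seed, then evaluate the calls."""
    seeds = make_seeds(random_state, len(inputs))
    return [noisy_add(x, s) for x, s in zip(inputs, seeds)]


first = run_with_seeds([1, 1, 1, 1, 1], random_state=1)
second = run_with_seeds([1, 1, 1, 1, 1], random_state=1)
print(first == second)  # True: the whole run is reproducible
```

Because each call receives its own seed, the result does not depend on which worker runs which element or in what order, which is what makes a parallel run repeatable.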
mapacross(*args)
Map multiple functions to an iterable in a matched-pair fashion. The number of functions needs to equal the length of the iterable.
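
As a rough illustration of matched-pair mapping in plain Python (a sketch, not the utilz implementation; `mapacross_sketch` is a hypothetical name, and the last positional argument is assumed to be the iterable):

```python
def mapacross_sketch(*args):
    """Apply the i-th function to the i-th element of the iterable."""
    *funcs, iterme = args
    items = list(iterme)
    if len(funcs) != len(items):
        raise ValueError("need exactly one function per element")
    return [f(x) for f, x in zip(funcs, items)]


print(mapacross_sketch(str.upper, len, ["ab", "cde"]))  # ['AB', 3]
```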
mapcat(func, iterme, **kwargs)
Call map and concatenate the results afterwards. Particularly useful to ensure results are numpy arrays.
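
The map-then-flatten idea can be sketched with `itertools.chain`. This is a simplified stand-in under assumptions: `mapcat_sketch` is a hypothetical name, it returns a plain list rather than a numpy array, and it ignores `**kwargs`, the progress bar, and parallelization.

```python
from itertools import chain


def mapcat_sketch(func, iterme):
    """Map func over iterme, then flatten the per-element results one level."""
    results = list(iterme) if func is None else [func(x) for x in iterme]
    return list(chain.from_iterable(results))


print(mapcat_sketch(None, [[1, 2], [3, 4]]))            # [1, 2, 3, 4]
print(mapcat_sketch(lambda x: [x, x * 10], [1, 2]))     # [1, 10, 2, 20]
```

Note that the output can be longer than the input, which is why `mapcat` is the one member of the family that does not preserve length.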
mapcompose(*args, **kwargs)
Compose multiple functions together and map them over a sequence, i.e. a mini-pipe per element. Returns a list the same length as the input iterable containing the final function evaluation for each element.
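
One way to picture the "mini-pipe per element" is a left-to-right fold over the functions. The sketch below is illustrative only; `mapcompose_sketch` is a hypothetical name, and the last positional argument is assumed to be the iterable.

```python
from functools import reduce


def mapcompose_sketch(*args):
    """Run each element through all functions in order; keep the final value."""
    *funcs, iterme = args
    return [reduce(lambda acc, f: f(acc), funcs, x) for x in iterme]


# Each element is incremented first, then doubled.
print(mapcompose_sketch(lambda x: x + 1, lambda x: x * 2, [1, 2, 3]))  # [4, 6, 8]
```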
mapif(func, predicate_func, iterme, **kwargs)
Apply func to each element of iterme if predicate_func is True for that element; otherwise return the element unchanged.
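
In plain Python this conditional mapping amounts to a single comprehension. A minimal sketch, with `mapif_sketch` as a hypothetical name and `**kwargs` handling omitted:

```python
def mapif_sketch(func, predicate_func, iterme):
    """Transform only the elements that satisfy the predicate."""
    return [func(x) if predicate_func(x) else x for x in iterme]


# Multiply the even numbers by 10 and pass the odd ones through.
print(mapif_sketch(lambda x: x * 10, lambda x: x % 2 == 0, [1, 2, 3, 4]))  # [1, 20, 3, 40]
```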
mapmany(*args, **kwargs)
Map multiple functions separately to each element in an iterable. Returns a list of nested lists containing the output of each function evaluation on each element in iterme.
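
The nesting can be sketched as one inner list per input element. The grouping shown here (results grouped by element, in function order) is an assumption based on the description above; `mapmany_sketch` is a hypothetical name.

```python
def mapmany_sketch(*args):
    """Apply every function to every element, keeping one inner list per element."""
    *funcs, iterme = args
    return [[f(x) for f in funcs] for x in iterme]


# For each inner list, compute its sum and its length.
print(mapmany_sketch(sum, len, [[1, 2, 3], [4, 5]]))  # [[6, 3], [9, 2]]
```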
mapwith(func, iterwith, iterme, **kwargs)
Just like map, but accepts a second argument that can also be an iterator. In a pipe, iterme is always the last input to mapwith but the first input to func. If copy=True is passed and iterwith is not an iterator, an iterator is built with guaranteed copies of iterwith.
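
The argument order is easiest to see in a small sketch. This is not the utilz code: `mapwith_sketch` is a hypothetical name, and the rule for telling a fixed argument from a paired one (a list or tuple is paired element-wise, anything else is reused for every call) is an assumption. The `copy` option is not modelled.

```python
def mapwith_sketch(func, iterwith, iterme):
    """Call func(element, extra) for each element of iterme."""
    items = list(iterme)
    if isinstance(iterwith, (list, tuple)):
        # Assumption: sequences are paired with iterme element by element.
        if len(iterwith) != len(items):
            raise ValueError("iterwith and iterme must have the same length")
        return [func(x, w) for x, w in zip(items, iterwith)]
    # Otherwise the same extra argument is passed on every call.
    return [func(x, iterwith) for x in items]


def power(base, exponent):
    return base ** exponent


print(mapwith_sketch(power, 2, [1, 2, 3]))          # [1, 4, 9]
print(mapwith_sketch(power, [1, 2, 3], [2, 2, 2]))  # [2, 4, 8]
```

In both calls the elements of the last argument arrive as the first argument of `power`, matching the ordering described above.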