Metadata-Version: 2.1
Name: busybee
Version: 1.0.0
Summary: Simple and interactive multi-processing for Python and notebooks
Home-page: https://github.com/lambdapioneer/busybee
Author: Daniel H
Author-email: not_provided@example.org
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: License :: OSI Approved :: MIT License
Classifier: Development Status :: 5 - Production/Stable
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# BusyBee 🐝

 - Simple, interactive multiprocessing for slow I/O and calculation in your notebook
 - Simple to use as a drop-in replacement for the standard `map` function
 - Prints the current progress and remaining time estimate
 - No external dependencies and 100% test coverage

The PyPI project page is here: https://pypi.org/project/busybee/

[![CircleCI](https://circleci.com/gh/lambdapioneer/busybee.svg?style=svg)](https://circleci.com/gh/lambdapioneer/busybee)

## Quick start

Install the BusyBee module via `pip` and use it as a replacement for your current `map` function. As BusyBee needs to know to total number of items the data must expose its length to `len()` calls. The best approach is to provide it as a `list`.

**Install:**

```bash
$ pip3 install busybee
```

**Code:**

```python
import busybee
result = busybee.map(func, data)
```

**Output:**

```C
BusyBee: Start processing 42 items with 8 processes...
BusyBee:  1/42,  2.4% (avg: 3.2s cpu, rem: 16.5s)
BusyBee: 15/42, 35.7% (avg: 2.4s cpu, rem: 8.1s)
BusyBee: 21/42, 50.0% (avg: 2.5s cpu, rem: 6.5s)
BusyBee: 24/42, 57.1% (avg: 2.6s cpu, rem: 5.8s)
BusyBee: 34/42, 81.0% (avg: 2.5s cpu, rem: 2.5s)
BusyBee: Finished processing 42 items in 16.1s (avg: 2.6s cpu)
```

## Advanced usage 👩‍💻 👨‍💻

You can configure the amount of cores to be used using the `processes` argument. For this you can either provide a number (e.g. 1, 8) or a simple formula such as `n/2` or `n-1`. The `n` refers to the logical number of CPU cores returned by the `multiprocessing` module.

Further, you can configure the output by providing a custom `stdout` sink and configuring how often you want to receive an update. You can do so by using the arguments `update_every_n_seconds` (default: 10) and `update_every_n_percent` (default: 50).

If you do not want any output, just set `quiet=False`.

**Example:**


```python
import busybee
result = busybee.map(
    func, data
    processes='n-1',
    update_every_n_seconds=10,
    update_every_n_percent=25,
)
```

## Q&A 🤔

**Why did you built it? And why shouldn't I just use the `multiprocessing` module?**

I started building BusyBee when I was working with a lot of I/O and pre-processing in Python Notebooks. Parallelizing these cells made it much faster, but it was often more involved than a one-line change. For instance, the worker pool needs manual clean-up and leaks a substantial amount of memory otherwise.

Also, it was hard to predict the remaining time and whether it was worth to avoid context switching or actually making some tea/coffee ☕.

**Is there more than the `map(func, data, ...)` function?**

Yes! While the `map` function is the most universal, there are more: The `filter(func, data, ...)` functions works similar to the regular `filter` function applied on lists. The `mk_dict(func, keys, ...)` function resembles the dictionary compression syntax `{k: func(k) for k in keys)`.

**I want a different output!**

I want to allow choosing from certain output styles. This is on my roadmap, but I do not have any certain date in mind. To maintain the simplicity I do not envision supporting custom output formatting. However, I am happy to be convinced otherwise.

## Contribute 👋

Awesome that you are interested in improving this code! When contributing, please follow the following (common-sense) steps:

 - Create an issue before you write any code. This allows to guide you in the right direction.
    - If you are after a simple 1-5 line fix, you might ignore this.
 - In the pull-request explain the high-level goal and your approach. That provides valuable context.
 - Convince others (and yourself) that the change is safe and sound.
    - Run `python3 -m unittest tests/test*` after you added test cases for your changes
    - Run `coverage3 run --source busybee setup.py test && coverage3 report` to ensure that the code is actually fully covered

## Reference/BibTex 📚

If you want to reference BusyBee in documentation or articles, feel free to use this suggested BibTex snippet:

```
@misc{hugenroth2020busybee,
  author={{Daniel Hugenroth}},
  title={BusyBee Python Software Library},
  year={2020},
  url={https://github.com/lambdapioneer/busybee},
}
```


