wsiprocess package

wsiprocess.annotation

Annotation object.

An Annotation object holds optional metadata for a slide. It can handle ASAP-style or WSIViewer-style annotations. By adding an annotationparser, you can process annotation data from other annotation tools.

Example

Loading annotation data:

    import wsiprocess as wp
    annotation = wp.annotation("path_to_annotation_file.xml")

Loading annotation data from image:

    import wsiprocess as wp
    annotation = wp.annotation("")
class wsiprocess.annotation.Annotation(path, is_image=False, slidename=False)
add_class(classes)
base_mask(cls, wsi_height, wsi_width)

Masks have the same size as the slide.

Masks are canvases of 0s.

Parameters:
  • cls (str) – Class name for each mask.
  • wsi_height (int) – The height of base masks.
  • wsi_width (int) – The width of base masks.
base_masks(wsi_height, wsi_width)

Make base masks.

Parameters:
  • wsi_height (int) – The height of base masks.
  • wsi_width (int) – The width of base masks.
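
As a rough illustration of "canvases of 0s", the sketch below builds one zero-filled canvas per class. The helper name `make_base_masks` is hypothetical, and plain nested lists stand in for whatever array type the library uses internally.

```python
def make_base_masks(classes, wsi_height, wsi_width):
    """Return one zero-filled canvas per class (hypothetical helper)."""
    return {
        cls: [[0] * wsi_width for _ in range(wsi_height)]
        for cls in classes
    }


# A tiny 3 x 4 "slide" keeps the example readable.
masks = make_base_masks(["benign", "malignant"], wsi_height=3, wsi_width=4)
print(sorted(masks))          # class names used as keys
print(masks["benign"])        # 3 rows of 4 zeros
```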
dot_to_bbox(width=30, height=False)

Translate dot annotations to bounding boxes.

If len(self.mask_coords[cls][idx]) is 1, the annotation is a dot, and the dot becomes the midpoint of the bounding box.

Parameters:
  • width (int) – Width of the translated bounding box.
  • height (int) – Height of the translated bounding box. If not set, height is equal to width.
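
The translation can be sketched as follows. This is a standalone illustration, not the library's implementation: the function takes a single `(x, y)` dot directly, and the two-corner output layout and the integer halving are assumptions.

```python
def dot_to_bbox(dot, width=30, height=False):
    """Expand an (x, y) dot into a box centered on it (illustrative sketch)."""
    if not height:
        height = width  # height defaults to width when not set
    x, y = dot
    half_w, half_h = width // 2, height // 2
    # Assumed output layout: [[left, top], [right, bottom]]
    return [[x - half_w, y - half_h], [x + half_w, y + half_h]]


print(dot_to_bbox((100, 200)))          # [[85, 185], [115, 215]]
print(dot_to_bbox((100, 200), 30, 10))  # [[85, 195], [115, 205]]
```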
exclude_coords(rule)

Exclude coordinates according to the rule.

Parameters:rule (wsiprocess.rule.Rule) – Rule object.
exclude_masks(rule)

Exclude areas from the base mask according to the rule.

Parameters:rule (wsiprocess.rule.Rule) – Rule object.
export_mask(save_to, cls)

Export one binary mask image.

Export a mask image with binary values of 0 or 1.

Parameters:
  • save_to (str) – Parent directory to save the mask.
  • cls (str) – Class name for each mask.
export_masks(save_to)

Export binary mask images.

Export the mask images for later processing such as segmentation. Exported masks contain binary values of 0 or 1.

Parameters:save_to (str) – Parent directory to save the masks.
export_thumb_mask(cls, save_to='.', size=512)

Export a thumbnail of one of the masks.

Export a thumbnail of one mask so that it can be checked beforehand.

Parameters:
  • cls (str) – Class name for each mask.
  • save_to (str, optional) – Parent directory to save the thumbnails.
  • size (int, optional) – Length of the long side of thumbnail.
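
To make "length of the long side" concrete, the sketch below scales a mask shape so that its longer side equals `size` while keeping the aspect ratio. The helper name `thumbnail_shape` and the rounding choice are assumptions for illustration only.

```python
def thumbnail_shape(height, width, size=512):
    """Scale (height, width) so that the longer side equals `size`."""
    scale = size / max(height, width)
    # Keep at least one pixel on the short side.
    return max(1, round(height * scale)), max(1, round(width * scale))


print(thumbnail_shape(2000, 4000))  # (256, 512)
print(thumbnail_shape(4000, 2000))  # (512, 256)
```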
export_thumb_masks(save_to='.', size=512)

Export thumbnails of the masks.

Export thumbnails of the masks so that they can be checked beforehand.

Parameters:
  • save_to (str) – Parent directory to save the thumbnails.
  • size (int) – Length of the long side of thumbnail.
from_image(mask, cls)

Load mask data from an image.

Parameters:
  • mask (numpy.ndarray) – 2D mask image with background as 0, and foreground as 255.
  • cls (str) – Name of the class of the mask image.
include_masks(rule)

Merge masks following the rule.

Parameters:rule (wsiprocess.rule.Rule) – Rule object.
main_masks()

Main masks

Write border lines following the rule and fill inside with 255.

make_foreground_mask(slide, size=2000, method='otsu', min_=30, max_=190)

Make foreground mask.

Make a simple foreground mask, by default with Otsu thresholding.

Parameters:
  • slide (wsiprocess.slide.Slide) – Slide object.
  • size (int or function, optional) – Size of the foreground mask used when computing the Otsu threshold.
  • method (str, optional) – Binarization method. By default, Otsu thresholding is used.
  • min_ (int, optional) – Used if method is “minmax”. Pixels with values between “min_” and “max_” are treated as foreground.
  • max_ (int, optional) – Used if method is “minmax”. Pixels with values between “min_” and “max_” are treated as foreground.
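
The “minmax” method can be pictured with the sketch below, which marks a grayscale pixel as foreground when its value lies between the two thresholds. It is a pure-Python illustration on nested lists; the function name and the inclusive bounds are assumptions, not the library's code.

```python
def minmax_foreground(gray, min_=30, max_=190):
    """Return a 0/1 mask: 1 where min_ <= pixel <= max_ (bounds assumed inclusive)."""
    return [[1 if min_ <= px <= max_ else 0 for px in row] for row in gray]


# Bright pixels (glass background) and very dark pixels are dropped.
gray = [
    [255, 250, 120],
    [10, 30, 190],
]
print(minmax_foreground(gray))  # [[0, 0, 1], [0, 1, 1]]
```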
make_masks(slide, rule=False, foreground='otsu', size=2000, min_=30, max_=190)

Make masks from the slide and rule.

Masks are made for each class and for the foreground area.

Parameters:
  • slide (wsiprocess.slide.Slide) – Slide object
  • rule (wsiprocess.rule.Rule, optional) – Rule object
  • foreground (str, optional) – One of {otsu, minmax}. If not set, Annotation does not make a foreground mask.
  • size (int, optional) – Size of the foreground mask used when computing the Otsu threshold.
  • method (str, optional) – Binarization method. By default, Otsu thresholding is used.
  • min_ (int, optional) – Used if method is “minmax”. Pixels with values between “min_” and “max_” are treated as foreground.
  • max_ (int, optional) – Used if method is “minmax”. Pixels with values between “min_” and “max_” are treated as foreground.
merge_include_coords(rule)

Merge coordinates according to the rule.

Parameters:rule (wsiprocess.rule.Rule) – Rule object.
read_annotation(annotation_type=False)

Parse the annotation data.

Parameters:annotation_type (str) – If provided, skip automatic type detection.

wsiprocess.converter

Convert wsiprocess-style annotation data to COCO, VOC, or YOLO style.

class wsiprocess.converter.Converter(root, save_to, ratio_arg)

Converter Class Args:

Attributes:

to_coco()
to_voc()
to_yolo()

wsiprocess.error

Custom errors for wsiprocess.

exception wsiprocess.error.AnnotationLabelError(message)

Error of annotations

Parameters:message (str) – Message to show in the stdout.
exception wsiprocess.error.MissCombinationError(message)

Error of the combination of the method and the annotation file.

Parameters:message (str) – Message to show in the stdout.
exception wsiprocess.error.OnParamError(message)

Error of on_annotation.

on_annotation must be more than 0 and up to 1.

Parameters:message (str) – Message to show in the stdout.
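
The constraint "more than 0 and up to 1" corresponds to the half-open interval (0, 1]. The sketch below shows such a check with a locally defined stand-in exception, so that it runs without the package installed; the function name `check_on_annotation` is hypothetical.

```python
class OnParamError(Exception):
    """Local stand-in for wsiprocess.error.OnParamError."""


def check_on_annotation(on_annotation):
    """Return the value unchanged, or raise if it is outside (0, 1]."""
    if not 0 < on_annotation <= 1:
        raise OnParamError(
            f"on_annotation must be more than 0 and up to 1, got {on_annotation}"
        )
    return on_annotation


print(check_on_annotation(0.5))  # 0.5
try:
    check_on_annotation(0)
except OnParamError as err:
    print("rejected:", err)
```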
exception wsiprocess.error.PatchSizeTooSmallError(message)

Error of the size of patches.

This should be warning?

Parameters:message (str) – Message to show in the stdout.
exception wsiprocess.error.SizeError(message)

Error of sizes.

The slide must be larger than the patches, and the patch size must be larger than the overlap size.

Parameters:message (str) – Message to show in the stdout.
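
A minimal sketch of the ordering slide > patch > overlap is given below. The exception is a local stand-in, the function name `check_sizes` is hypothetical, and treating a patch exactly as large as the slide as valid is an assumption.

```python
class SizeError(Exception):
    """Local stand-in for wsiprocess.error.SizeError."""


def check_sizes(wsi_width, wsi_height, patch_width, patch_height,
                overlap_width=0, overlap_height=0):
    """Raise SizeError unless slide >= patch > overlap on both axes."""
    if patch_width > wsi_width or patch_height > wsi_height:
        raise SizeError("patch is larger than the slide")
    # An overlap as large as the patch would give a stride of zero.
    if overlap_width >= patch_width or overlap_height >= patch_height:
        raise SizeError("overlap must be smaller than the patch")


check_sizes(1000, 600, 256, 256, 32, 32)  # valid, no exception
```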
exception wsiprocess.error.SlideLoadError(message)

Error on loading slides.

Parameters:message (str) – Message to show in the stdout.
exception wsiprocess.error.WsiProcessError(message)

Root error class.

Parameters:message (str) – Message to show in the stdout.

wsiprocess.patcher

Patcher object to extract patches from whole slide images.

class wsiprocess.patcher.Patcher(slide, method, annotation=False, save_to='.', patch_width=256, patch_height=256, overlap_width=0, overlap_height=0, offset_x=0, offset_y=0, on_foreground=0.5, on_annotation=0.5, start_sample=False, finished_sample=False, no_patches=False, crop_bbox=False)

Patcher object.

Parameters:
  • slide (wsiprocess.slide.Slide) – Slide object.
  • method (str) – Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}. Characters are converted to lowercase.
  • annotation (wsiprocess.annotation.Annotation, optional) – Annotation object.
  • save_to (str, optional) – The root of the output directory.
  • patch_width (int, optional) – The width of the output patches.
  • patch_height (int, optional) – The height of the output patches.
  • overlap_width (int, optional) – The width of the overlap areas of patches.
  • overlap_height (int, optional) – The height of the overlap areas of patches.
  • offset_x (int, optional) – The offset pixels along the x-axis.
  • offset_y (int, optional) – The offset pixels along the y-axis.
  • on_foreground (float, optional) – Ratio of overlap area between patches and foreground area.
  • on_annotation (float, optional) – Ratio of overlap area between patches and annotation.
  • start_sample (bool, optional) – Whether to save sample patches when Patcher starts.
  • finished_sample (bool, optional) – Whether to save sample patches when Patcher finishes its work.
  • extract_patches (bool, optional) – Deprecated: Patcher extracts patches unless “no_patches” is set.
  • no_patches (bool, optional) – If set, Patcher runs without extracting patches and saving them to disk.
slide

Slide object.

Type:wsiprocess.slide.Slide
wsi_width

Width of the slide.

Type:int
wsi_height

Height of the slide.

Type:int
filepath

Path to the whole slide image.

Type:str
filestem

Stem of the file name.

Type:str
method

Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}

Type:str
annotation

Annotation object.

Type:wsiprocess.annotation.Annotation
masks

Masks to show the location of classes.

Type:dict
classes

Classes to extract.

Type:list
save_to

The root of the output directory.

Type:str
p_width

The width of the output patches.

Type:int
p_height

The height of the output patches.

Type:int
p_area

The area of single patch.

Type:int
o_width

The width of the overlap areas of patches.

Type:int
o_height

The height of the overlap areas of patches.

Type:int
offset_x

The offset pixels along the x-axis.

Type:int
offset_y

The offset pixels along the y-axis.

Type:int
on_foreground

Ratio of overlap area between patches and foreground area.

Type:float
on_annotation

Ratio of overlap area between patches and annotation.

Type:float
start_sample

Whether to save sample patches on Patcher start.

Type:bool
finished_sample

Whether to save sample patches on Patcher finish.

Type:bool
extract_patches

Whether to save patches when Patcher runs.

Type:bool
no_patches

Whether to skip saving patches when Patcher runs.

Type:bool
x_lefttop

X-axis offsets of the patches, excluding the right-edge patch.

Type:list
y_lefttop

Y-axis offsets of the patches, excluding the bottom-edge patch.

Type:list
iterator

Offset coordinates of patches.

Type:list
last_x

X-axis offset of the right edge patch.

Type:int
last_y

Y-axis offset of the bottom edge patch.

Type:int
result

Temporary storage for the computed result of patches.

Type:dict
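
The grid attributes listed above (x_lefttop, y_lefttop, iterator, last_x, last_y) can be pictured with the following sketch. It is one plausible way to derive them from the slide size, patch size, overlap, and offsets, assuming a stride of patch size minus overlap; the function name `patch_grid` is hypothetical and the library may compute these differently.

```python
from itertools import product


def patch_grid(wsi_width, wsi_height, p_width=256, p_height=256,
               o_width=0, o_height=0, offset_x=0, offset_y=0):
    """Compute top-left offsets of the regular patches and of the edge patches."""
    # The edge patches are aligned to the right and bottom borders.
    last_x = wsi_width - p_width
    last_y = wsi_height - p_height
    # Regular offsets stop before the edge patch (stride = patch - overlap).
    x_lefttop = list(range(offset_x, last_x, p_width - o_width))
    y_lefttop = list(range(offset_y, last_y, p_height - o_height))
    iterator = list(product(x_lefttop, y_lefttop))
    return x_lefttop, y_lefttop, iterator, last_x, last_y


xs, ys, coords, last_x, last_y = patch_grid(1000, 600)
print(xs)              # [0, 256, 512]
print(ys)              # [0, 256]
print(last_x, last_y)  # 744 344
print(len(coords))     # 6
```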
annotation_cover_patch(coords, x, y)

Check if the annotation is covering the whole patch.

Parameters:
  • coords (np.array) – Coordinates of annotations.
  • x (int) – X coordinate of the top-left corner of the patch.
  • y (int) – Y coordinate of the top-left corner of the patch.
Returns:List of np.int64s which are the indices of bounding boxes on the patch.
Return type:idx_of_bb_on_patch (list)
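
As an illustration of the "covering" test, the sketch below returns the indices of boxes that fully contain a patch. It is a pure-Python stand-in, not the library's code: it assumes each box is stored as [left, top, right, bottom] (the layout shown under find_bbs) and takes the patch size as explicit, hypothetical arguments.

```python
def annotation_cover_patch(coords, x, y, p_width=256, p_height=256):
    """Indices of boxes [left, top, right, bottom] that fully contain the patch."""
    return [
        idx
        for idx, (left, top, right, bottom) in enumerate(coords)
        if left <= x and top <= y
        and right >= x + p_width and bottom >= y + p_height
    ]


boxes = [[0, 0, 1000, 1000], [300, 300, 400, 400]]
print(annotation_cover_patch(boxes, 256, 256))  # [0]
```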

corner_on_patch(coords, x, y)

Check if at least one of the corners is on the patch.

Parameters:
  • coords (np.array) – Coordinates of annotations.
  • x (int) – X coordinate of the top-left corner of the patch.
  • y (int) – Y coordinate of the top-left corner of the patch.
Returns:List of np.int64s which are the indices of bounding boxes on the patch.
Return type:idx_of_bb_on_patch (list)
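
The corner test can be sketched as below. This is a pure-Python illustration under assumptions: boxes are stored as [left, top, right, bottom], the patch size is passed explicitly, and points on the patch border count as being on the patch.

```python
def corner_on_patch(coords, x, y, p_width=256, p_height=256):
    """Indices of boxes with at least one corner inside the patch."""
    def inside(cx, cy):
        return x <= cx <= x + p_width and y <= cy <= y + p_height

    hits = []
    for idx, (left, top, right, bottom) in enumerate(coords):
        corners = [(left, top), (right, top), (left, bottom), (right, bottom)]
        if any(inside(cx, cy) for cx, cy in corners):
            hits.append(idx)
    return hits


boxes = [
    [200, 200, 300, 300],  # bottom-right corner falls on the patch
    [600, 600, 700, 700],  # entirely outside
    [0, 0, 1000, 1000],    # surrounds the patch, but no corner is on it
]
print(corner_on_patch(boxes, 256, 256))  # [0]
```

The third box shows why a corner test alone is not enough, which is presumably why side_on_patch and annotation_cover_patch exist as separate checks.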

find_bbs(x, y, cls)

Find bounding boxes which are on the patch.

A bounding box with at least one of its corners on the patch is considered to be on the patch.
Example: annotation.mask_coords["benign"][0]
= [small_x, small_y, large_x, large_y] = [bbleft, bbtop, bbright, bbbottom]
Parameters:
  • x (int) – X-axis offset of patch.
  • y (int) – Y-axis offset of patch.
  • cls (str) – Class of the patch or the bounding box or the segmented area.
find_masks(x, y, cls)

Get the masked area corresponding to the given patch area.

Parameters:
  • x (int) – X-axis offset of a patch.
  • y (int) – Y-axis offset of a patch.
  • cls (str) – Class of the patch or the bounding box or the segmented area.
Returns:List containing a dict of coords and its class. Here, coords is a path to the PNG image.
Return type:masks (list)

get_mini_patch_parallel(classes=False)
get_patch(x, y, classes=False)

Extract a single patch.

Parameters:
  • x (int) – X-axis offset of a patch.
  • y (int) – Y-axis offset of a patch.
  • classes (list) – When method is classification, the patch is extracted once per class if it lies on the border of two or more classes. Setting on_annotation=1.0 should prevent Patcher from extracting a single patch for multiple classes.
get_patch_parallel(classes=False, cores=-1)

Run get_patch() in parallel.

Parameters:
  • classes (list) – Classes to extract.
  • cores (int) – Number of threads to run. -1 means the same as the number of cores.
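
The `cores=-1` convention can be sketched with the standard library as follows. The thread pool here stands in for whatever parallel backend the library uses, and `run_parallel` and the toy worker are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_parallel(func, coords, cores=-1):
    """Apply func(x, y) to every coordinate pair using a thread pool."""
    if cores == -1:
        cores = os.cpu_count() or 1  # fall back to 1 if the count is unknown
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # map() preserves the input order of the coordinates.
        return list(pool.map(lambda xy: func(*xy), coords))


def describe_patch(x, y):
    """Toy worker standing in for a per-patch extraction step."""
    return f"patch at ({x}, {y})"


print(run_parallel(describe_patch, [(0, 0), (256, 0), (0, 256)]))
```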
get_random_sample(phase, sample_count=1)

Get random patches to check that the patcher works properly.

Parameters:
  • phase (str) – When to check. One of {start, finish}
  • sample_count (int) – Number of patches to extract.
patch_on_annotation(cls, x, y)

Check if the patch is on the annotation area of a class.

Parameters:
  • cls (str) – Class of the patch or the bounding box or the segmented area.
  • x (int) – X-axis offset of a patch.
  • y (int) – Y-axis offset of a patch.
Returns:Whether the patch is on the annotation.
Return type:bool
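
One way to read this check together with the on_annotation ratio is sketched below: count the annotated pixels under the patch and compare their share of the patch area with the threshold. The nested-list mask, the explicit patch-size arguments, and the "greater than or equal" comparison are assumptions for illustration.

```python
def patch_on_annotation(mask, x, y, p_width, p_height, on_annotation=0.5):
    """True if the annotated share of the patch is at least on_annotation."""
    covered = sum(
        1
        for row in mask[y:y + p_height]
        for px in row[x:x + p_width]
        if px > 0
    )
    return covered / (p_width * p_height) >= on_annotation


mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
print(patch_on_annotation(mask, 2, 0, 2, 2))  # True  (4 of 4 pixels)
print(patch_on_annotation(mask, 0, 0, 2, 2))  # False (0 of 4 pixels)
print(patch_on_annotation(mask, 2, 2, 2, 2))  # True  (2 of 4 pixels)
```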

patch_on_foreground(x, y)

Check if the patch is on the foreground area.

Parameters:
  • x (int) – X-axis offset of a patch.
  • y (int) – Y-axis offset of a patch.
Returns:Whether the patch is on the foreground area.
Return type:bool

remove_dup_in_results()

Remove duplicate results in self.result["results"].
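
Deduplicating a list of result records while keeping their order can be sketched as below. The record fields ("x", "y", "class") are hypothetical, and the sketch assumes flat dicts with hashable values, which may not match the real structure of self.result["results"].

```python
def remove_duplicates(results):
    """Drop repeated result dicts while keeping first-seen order."""
    seen = set()
    unique = []
    for item in results:
        key = tuple(sorted(item.items()))  # hashable fingerprint of the dict
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


results = [
    {"x": 0, "y": 0, "class": "benign"},
    {"x": 256, "y": 0, "class": "benign"},
    {"x": 0, "y": 0, "class": "benign"},  # duplicate of the first record
]
print(remove_duplicates(results))
```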

save_patch_result(x, y, cls)

Save the extracted patch data to result.

Parameters:
  • x (int) – X-axis offset of patch.
  • y (int) – Y-axis offset of patch.
  • cls (str) – Class of the patch or the bounding box or the segmented area.
save_results()

Save the extraction results.

Saves some metadata along with the patch results.

side_on_patch(coords, x, y)

Check if at least one of the sides is on the patch.

Parameters:
  • coords (np.array) – Coordinates of annotations.
  • x (int) – X coordinate of the top-left corner of the patch.
  • y (int) – Y coordinate of the top-left corner of the patch.
Returns:List of np.int64s which are the indices of bounding boxes on the patch.
Return type:idx_of_bb_on_patch (list)

to_bb(coord)

Convert coordinates to VOC coordinates.

Parameters:coord (list) –

List of coordinates stored as below:

[[xOfOneCorner, yOfOneCorner],
 [xOfApex,      yOfApex]]
Returns:List of coordinates stored as below:
[[xmin, ymin],
 [xmin, ymax],
 [xmax, ymax],
 [xmax, ymin]]
Return type:outer_coord (list)
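
The conversion described above can be sketched in a few lines. This is an illustrative standalone function, assuming the two input points are opposite corners given in any order; it is not taken from the library source.

```python
def to_bb(coord):
    """Turn two opposite corners into the four corners of the box."""
    (x1, y1), (x2, y2) = coord
    xmin, xmax = min(x1, x2), max(x1, x2)
    ymin, ymax = min(y1, y2), max(y1, y2)
    return [[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]]


print(to_bb([[300, 50], [100, 200]]))
# [[100, 50], [100, 200], [300, 200], [300, 50]]
```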

wsiprocess.rule

Object to define rules for extracting patches.

The rule file should be a JSON file. The content of rule.json looks like the example below.

Example

The JSON data below defines the following:

  • extract the patches of benign and malignant
  • benign includes stroma but excludes malignant or uncertain
  • malignant means malignant itself but excludes benign
{
    "benign" : {
        "includes" : [
            "stroma"
        ],
        "excludes" : [
            "malignant",
            "uncertain"
        ]
    },
    "malignant": {
        "includes" : [
        ],
        "excludes" :[
            "benign"
        ]
    }
}
class wsiprocess.rule.Rule(path)

Base class for rule.

Parameters:path (str) – Path to the rule.json file.
classes

List of the classes, e.g. [“benign”, “malignant”]

Type:list
read_rule

Dict of rules, e.g. {“benign”: {“includes”: [“stroma”]}}

Type:dict
read_rule()

Read the rule file.

Parse the rule file and store its top-level keys as the classes.
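
As a minimal sketch of this parsing step, the snippet below loads a rule like the one in the example with the standard `json` module and takes the top-level keys as the classes. It reads from a string rather than a file and is an assumption about the behavior, not the library's implementation.

```python
import json

# Same structure as the rule.json example, inlined so the sketch is self-contained.
RULE_JSON = """
{
    "benign": {"includes": ["stroma"], "excludes": ["malignant", "uncertain"]},
    "malignant": {"includes": [], "excludes": ["benign"]}
}
"""

rule = json.loads(RULE_JSON)
classes = list(rule.keys())

print(classes)                     # ['benign', 'malignant']
print(rule["benign"]["includes"])  # ['stroma']
print(rule["benign"]["excludes"])  # ['malignant', 'uncertain']
```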

wsiprocess.slide

Slide object to pass to the Annotation object and the Patcher object. A slide is a whole slide image scanned with a whole slide scanner. You can also make a pyramidal TIFF file manually and handle it just like scanned digital data, except for the magnification.

class wsiprocess.slide.Slide(path)

Slide object.

Parameters:path (str) – Path to the whole slide image file.
path

Path to the whole slide image file.

Type:str
slide

pyvips Image object.

Type:pyvips.Image
wsi_width

Width of slide.

Type:int
wsi_height

Height of slide.

Type:int
export_thumbnail(save_as='./thumb.png', size=500)

Export thumbnail image.

Parameters:
  • save_as (str) – Path to save as the thumbnail image.
  • size (int, optional) – Size of the exported thumbnail.
get_thumbnail(size=500)

Get thumbnail image.

Parameters:size (int, optional) – Size of the exported thumbnail.
set_properties()

Read the properties and set them as attributes of the slide object.

magnification

Objective power of the slide object.

Type:int

wsiprocess.utils

wsiprocess.utils.show_bounding_box(patch_path, result_path, save_as)
wsiprocess.utils.show_mask_on_patch(patch_path, mask_path, save_as)

wsiprocess.verify

The verification script runs before the patcher starts. The Verify class verifies the output directory, annotation files, rule files, etc. It is mainly run from the CLI.

class wsiprocess.verify.Verify(save_to, filestem, method, start_sample, finished_sample, no_patches, crop_bbox)

Verification class.

Parameters:
  • save_to (str) – The root of the output directory.
  • filestem (str) – The name of the output directory.
  • method (str) – Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}
  • start_sample (bool) – Whether to save sample patches on Patcher start.
  • finished_sample (bool) – Whether to save sample patches on Patcher finish.
  • extract_patches (bool) – [Deleted] Whether to save patches when Patcher runs.
  • no_patches (bool) – Whether to skip saving patches when Patcher runs.
save_to

The root of the output directory.

Type:str
filestem

The name of the output directory.

Type:str
method

Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}

Type:str
start_sample

Whether to save sample patches on Patcher start.

Type:bool
finished_sample

Whether to save sample patches on Patcher finish.

Type:bool
extract_patches

[Deleted] Whether to save patches when Patcher runs.

Type:bool
no_patches

Whether to skip saving patches when Patcher runs.

Type:bool
magnification(slide, magnification)

Check if the slide has data for the magnification the user specified.

Parameters:
  • slide (wp.slide.Slide) – Slide object to check.
  • magnification (int) – Target magnification value which has to be smaller than the magnification of the slide.
static make_dir(path)

Make output directory.

make_dirs()

Ensure the output directories exist for each task.

on_params(on_annotation, on_foreground)

Verify the ratios of on_annotation and on_foreground.

Parameters:
  • on_annotation (float) – Overlap ratio of patches and annotations.
  • on_foreground (float) – Overlap ratio of patches and foreground area
Raises:wsiprocess.error.SizeError – If the sizes are invalid.

sizes(wsi_width, wsi_height, offset_x, offset_y, patch_width, patch_height, overlap_width, overlap_height, dot_bbox_width, dot_bbox_height)

Verify the sizes of the slide, the patch and the overlap area.

Raises:wsiprocess.error.SizeError – If the sizes are invalid.