wsiprocess package
wsiprocess.annotation
Annotation object.
An Annotation object is optional metadata for a slide. It can handle ASAP or WSIViewer style annotations. By adding an annotationparser, you can process annotation data from other types of annotation tools.
Example
Loading annotation data:
import wsiprocess as wp
annotation = wp.annotation("path_to_annotation_file.xml")
Loading annotation data from image:
import wsiprocess as wp
annotation = wp.annotation("")
-
class
wsiprocess.annotation.
Annotation
(path, is_image=False, slidename=False)¶
-
add_class
(classes)¶
-
base_mask
(cls, wsi_height, wsi_width)¶ Masks have the same size as the slide.
Masks are canvases of 0s.
Parameters: - cls (str) – Class name for each mask.
- wsi_height (int) – The height of base masks.
- wsi_width (int) – The width of base masks.
-
base_masks
(wsi_height, wsi_width)¶ Make base masks.
Parameters: - wsi_height (int) – The height of base masks.
- wsi_width (int) – The width of base masks.
-
dot_to_bbox
(width=30, height=False)¶ Translate dot annotations to bounding boxes.
If len(self.mask_coords[cls][idx]) is 1, the annotation is a dot, and the dot is the midpoint of the bounding box.
Parameters: - width (int) – Width of the translated bounding box.
- height (int) – Height of the translated bounding box. If not set, height is equal to width.
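As a rough illustration of the conversion described above, the sketch below centers a box on a single dot. The function name dot_to_bbox_sketch, the [left, top, right, bottom] output order, and the floor rounding of the half sizes are assumptions for illustration, not the library's implementation.

```python
def dot_to_bbox_sketch(dot, width=30, height=False):
    """Return [left, top, right, bottom] for a box centered on `dot`."""
    # Mirror the documented default: height falls back to width when unset.
    if not height:
        height = width
    x, y = dot
    half_w, half_h = width // 2, height // 2  # assumed rounding: floor
    return [x - half_w, y - half_h, x + half_w, y + half_h]


box = dot_to_bbox_sketch((100, 200))
wide_box = dot_to_bbox_sketch((100, 200), width=40, height=20)
print(box, wide_box)
```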
-
exclude_coords
(rule)¶ Exclude coordinates following the rule.
Parameters: rule (wsiprocess.rule.Rule) – Rule object.
-
exclude_masks
(rule)¶ Exclude areas from the base mask following the rule.
Parameters: rule (wsiprocess.rule.Rule) – Rule object.
-
export_mask
(save_to, cls)¶ Export one binary mask image.
The exported mask image contains binary values of 0 or 1.
Parameters: - save_to (str) – Parent directory to save the mask.
- cls (str) – Class name for each mask.
-
export_masks
(save_to)¶ Export binary mask images.
Export the mask images for later computation such as segmentation. Exported masks contain binary values of 0 or 1.
Parameters: save_to (str) – Parent directory to save the masks.
-
export_thumb_mask
(cls, save_to='.', size=512)¶ Export a thumbnail of one of the masks.
For a preliminary check, export a thumbnail of one of the masks.
Parameters: - cls (str) – Class name for each mask.
- save_to (str, optional) – Parent directory to save the thumbnails.
- size (int, optional) – Length of the long side of thumbnail.
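Since size is the length of the long side of the thumbnail, the short side follows from the aspect ratio. A minimal sketch of that arithmetic is shown below; the helper name thumbnail_shape and the rounding to whole pixels are assumptions, and the library may delegate this to its image backend instead.

```python
def thumbnail_shape(width, height, size=512):
    """Scale (width, height) so that the longer side equals `size`."""
    long_side = max(width, height)
    # Round to whole pixels and keep at least one pixel per side.
    new_width = max(1, round(width * size / long_side))
    new_height = max(1, round(height * size / long_side))
    return new_width, new_height


print(thumbnail_shape(40000, 20000))       # landscape slide
print(thumbnail_shape(300, 900, size=512))  # portrait slide
```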
-
export_thumb_masks
(save_to='.', size=512)¶ Export thumbnails of masks.
For a preliminary check, export thumbnails of the masks.
Parameters: - save_to (str) – Parent directory to save the thumbnails.
- size (int) – Length of the long side of thumbnail.
-
from_image
(mask, cls)¶ Load mask data from an image.
Parameters: - mask (numpy.ndarray) – 2D mask image with background as 0, and foreground as 255.
- cls (str) – Name of the class of the mask image.
-
include_masks
(rule)¶ Merge masks following the rule.
Parameters: rule (wsiprocess.rule.Rule) – Rule object.
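The merge and exclude steps can be pictured with a small sketch on plain nested lists. The helper names merge_masks and exclude_mask, and the use of 255 for foreground pixels, are assumptions for illustration; this reference does not show how the library combines its masks internally.

```python
def merge_masks(base, other):
    """Pixelwise union of two equally sized masks (lists of rows)."""
    return [[max(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(base, other)]


def exclude_mask(base, other):
    """Clear every pixel of `base` that is set in `other`."""
    return [[0 if b else a for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(base, other)]


benign = [[255, 255, 0, 0]]
stroma = [[0, 0, 255, 0]]
malignant = [[0, 255, 0, 0]]

# Example rule: "benign" includes stroma, then excludes malignant.
merged = merge_masks(benign, stroma)
result = exclude_mask(merged, malignant)
print(merged, result)
```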
-
main_masks
()¶ Make the main masks.
Draw border lines following the rule and fill the inside with 255.
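To make the idea of a zero canvas with filled regions concrete, here is a minimal sketch on nested lists. It fills an axis-aligned rectangle rather than an arbitrary annotation outline, and both helper names are hypothetical; the library itself presumably draws real polygon borders with an image library.

```python
def zero_canvas(wsi_height, wsi_width):
    """Return a canvas of 0s with the same height and width as the slide."""
    return [[0] * wsi_width for _ in range(wsi_height)]


def fill_rectangle(mask, left, top, right, bottom, value=255):
    """Fill the half-open region [left, right) x [top, bottom) in place."""
    for row in mask[max(top, 0):bottom]:
        lo, hi = max(left, 0), min(right, len(row))  # clamp to the canvas
        if lo < hi:
            row[lo:hi] = [value] * (hi - lo)
    return mask


mask = fill_rectangle(zero_canvas(4, 6), left=1, top=1, right=4, bottom=3)
for row in mask:
    print(row)
```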
-
make_foreground_mask
(slide, size=2000, method='otsu', min_=30, max_=190)¶ Make foreground mask.
Make a simple foreground mask with Otsu thresholding.
Parameters: - slide (wsiprocess.slide.Slide) – Slide object.
- size (int or function, optional) – Size of the foreground mask used when calculating with Otsu thresholding.
- method (str, optional) – Binarization method. By default, Otsu thresholding is used.
- min_ (int, optional) – Used if method is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
- max_ (int, optional) – Used if method is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
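The “minmax” method described above amounts to a range test on each grayscale pixel. The sketch below shows that test on a nested list; the helper name minmax_foreground and the choice of inclusive bounds are assumptions, since this reference does not say whether the bounds are inclusive.

```python
def minmax_foreground(gray, min_=30, max_=190):
    """Mark pixels whose value lies in [min_, max_] as foreground (1)."""
    return [[1 if min_ <= value <= max_ else 0 for value in row]
            for row in gray]


gray = [
    [255, 250, 120],  # bright background, then tissue
    [10, 30, 190],    # very dark pixel, then both boundary values
]
foreground = minmax_foreground(gray)
print(foreground)
```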
-
make_masks
(slide, rule=False, foreground='otsu', size=2000, min_=30, max_=190)¶ Make masks from the slide and rule.
Masks are made for each class and for the foreground area.
Parameters: - slide (wsiprocess.slide.Slide) – Slide object.
- rule (wsiprocess.rule.Rule, optional) – Rule object.
- foreground (str, optional) – One of {otsu, minmax}. If not set, Annotation does not make a foreground mask.
- size (int, optional) – Size of the foreground mask used when calculating with Otsu thresholding.
- method (str, optional) – Binarization method. By default, Otsu thresholding is used.
- min_ (int, optional) – Used if method is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
- max_ (int, optional) – Used if method is “minmax”. The Annotation object defines foreground as the pixels with values between “min_” and “max_”.
-
merge_include_coords
(rule)¶ Merge coordinates following the rule.
Parameters: rule (wsiprocess.rule.Rule) – Rule object.
-
read_annotation
(annotation_type=False)¶ Parse the annotation data.
Parameters: annotation_type (str) – If provided, skip the automatic type detection.
-
wsiprocess.converter
Convert wsiprocess style annotation data to COCO or VOC style.
wsiprocess.error
Custom errors for wsiprocess.
-
exception
wsiprocess.error.
AnnotationLabelError
(message)¶ Error of annotations
Parameters: message (str) – Message to show in the stdout.
-
exception
wsiprocess.error.
MissCombinationError
(message)¶ Error of the combination of the method and the annotation file.
Parameters: message (str) – Message to show in the stdout.
-
exception
wsiprocess.error.
OnParamError
(message)¶ Error of on_annotation.
on_annotation must be more than 0 and up to 1.
Parameters: message (str) – Message to show in the stdout.
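The documented constraint, more than 0 and up to 1, corresponds to the half-open interval (0, 1]. A minimal sketch of such a check follows; OnParamErrorSketch is a local stand-in class and check_on_annotation is a hypothetical helper, not part of wsiprocess.

```python
class OnParamErrorSketch(Exception):
    """Stand-in for wsiprocess.error.OnParamError in this sketch."""


def check_on_annotation(on_annotation):
    """Raise unless on_annotation lies in the interval (0, 1]."""
    if not 0 < on_annotation <= 1:
        raise OnParamErrorSketch(
            f"on_annotation must be more than 0 and up to 1, got {on_annotation}"
        )
    return on_annotation


print(check_on_annotation(0.5))
print(check_on_annotation(1))
```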
-
exception
wsiprocess.error.
PatchSizeTooSmallError
(message)¶ Error of the size of patches.
This should be warning?
Parameters: message (str) – Message to show in the stdout.
-
exception
wsiprocess.error.
SizeError
(message)¶ Error of sizes.
The slide must be larger than the patches, and the patches must be larger than the overlap.
Parameters: message (str) – Message to show in the stdout.
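The size relationship above can be expressed as two comparisons per axis. The sketch below assumes strict inequalities and uses a local stand-in exception; the helper name check_sizes and its reduced parameter list are illustrative only and do not mirror the library's own verification code.

```python
class SizeErrorSketch(Exception):
    """Stand-in for wsiprocess.error.SizeError in this sketch."""


def check_sizes(wsi_width, wsi_height, patch_width, patch_height,
                overlap_width=0, overlap_height=0):
    """Raise unless slide > patch > overlap holds along both axes."""
    if patch_width >= wsi_width or patch_height >= wsi_height:
        raise SizeErrorSketch("patch must be smaller than the slide")
    if overlap_width >= patch_width or overlap_height >= patch_height:
        raise SizeErrorSketch("overlap must be smaller than the patch")
    return True


print(check_sizes(40000, 20000, 256, 256, overlap_width=32, overlap_height=32))
```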
-
exception
wsiprocess.error.
SlideLoadError
(message)¶ Error on loading slides.
Parameters: message (str) – Message to show in the stdout.
-
exception
wsiprocess.error.
WsiProcessError
(message)¶ Root error class.
Parameters: message (str) – Message to show in the stdout.
wsiprocess.patcher
Patcher object to extract patches from whole slide images.
-
class
wsiprocess.patcher.
Patcher
(slide, method, annotation=False, save_to='.', patch_width=256, patch_height=256, overlap_width=0, overlap_height=0, offset_x=0, offset_y=0, on_foreground=0.5, on_annotation=0.5, start_sample=False, finished_sample=False, no_patches=False, crop_bbox=False)¶ Patcher object.
Parameters: - slide (wsiprocess.slide.Slide) – Slide object.
- method (str) – Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}. Characters are converted to lowercase.
- annotation (wsiprocess.annotation.Annotation, optional) – Annotation object.
- save_to (str, optional) – The root of the output directory.
- patch_width (int, optional) – The width of the output patches.
- patch_height (int, optional) – The height of the output patches.
- overlap_width (int, optional) – The width of the overlap areas of patches.
- overlap_height (int, optional) – The height of the overlap areas of patches.
- offset_x (int, optional) – The offset pixels along the x-axis.
- offset_y (int, optional) – The offset pixels along the y-axis.
- on_foreground (float, optional) – Ratio of overlap area between patches and foreground area.
- on_annotation (float, optional) – Ratio of overlap area between patches and annotation.
- start_sample (bool, optional) – Whether to save sample patches on Patcher starting.
- finished_sample (bool, optional) – Whether to save sample patches on Patcher finished its work.
- extract_patches (bool, optional) – Deprecated, because Patcher extracts patches unless “no_patches” is set.
- no_patches (bool, optional) – If set, Patcher runs without extracting patches or saving them to disk.
-
slide
¶ Slide object.
Type: wsiprocess.slide.Slide
-
wsi_width
¶ Width of the slide.
Type: int
-
wsi_height
¶ Height of the slide.
Type: int
-
filepath
¶ Path to the whole slide image.
Type: str
-
filestem
¶ Stem of the file name.
Type: str
-
method
¶ Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}
Type: str
-
annotation
¶ Annotation object.
Type: wsiprocess.annotation.Annotation
-
masks
¶ Masks to show the location of classes.
Type: dict
-
classes
¶ Classes to extract.
Type: list
-
save_to
¶ The root of the output directory.
Type: str
-
p_width
¶ The width of the output patches.
Type: int
-
p_height
¶ The height of the output patches.
Type: int
-
p_area
¶ The area of single patch.
Type: int
-
o_width
¶ The width of the overlap areas of patches.
Type: int
-
o_height
¶ The height of the overlap areas of patches.
Type: int
-
offset_x
¶ The offset pixels along the x-axis.
Type: int
-
offset_y
¶ The offset pixels along the y-axis.
Type: int
-
on_foreground
¶ Ratio of overlap area between patches and foreground area.
Type: float
-
on_annotation
¶ Ratio of overlap area between patches and annotation.
Type: float
-
start_sample
¶ Whether to save sample patches on Patcher start.
Type: bool
-
finished_sample
¶ Whether to save sample patches on Patcher finish.
Type: bool
-
extract_patches
¶ Whether to save patches when Patcher runs.
Type: bool
-
no_patches
¶ Whether to skip saving patches when Patcher runs.
Type: bool
-
x_lefttop
¶ Offsets of patches to the x-axis direction except for the right edge.
Type: list
-
y_lefttop
¶ Offsets of patches to the y-axis direction except for the bottom edge.
Type: list
-
iterator
¶ Offset coordinates of patches.
Type: list
-
last_x
¶ X-axis offset of the right edge patch.
Type: int
-
last_y
¶ Y-axis offset of the bottom edge patch.
Type: int
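The offset attributes above (x_lefttop, y_lefttop, iterator, last_x, last_y) can be pictured with a small grid computation. This is only a sketch: the stride of patch size minus overlap, the exclusive stop at the edge patch, and the helper name patch_offsets are assumptions, and this reference does not show how Patcher actually builds these lists.

```python
from itertools import product


def patch_offsets(wsi_length, patch_length, overlap_length=0, offset=0):
    """Return (interior offsets, offset of the final edge patch) on one axis."""
    stride = patch_length - overlap_length
    last = wsi_length - patch_length  # edge patch is flush with the border
    return list(range(offset, last, stride)), last


x_lefttop, last_x = patch_offsets(1000, 256)
y_lefttop, last_y = patch_offsets(600, 256, overlap_length=56)

# Every (x, y) pair of interior offsets, analogous to the iterator attribute.
iterator = list(product(x_lefttop, y_lefttop))
print(x_lefttop, last_x)
print(y_lefttop, last_y)
print(iterator)
```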
-
result
¶ Temporary storage for the computed result of patches.
Type: dict
-
annotation_cover_patch
(coords, x, y)¶ Check if the annotation is covering the whole patch.
Parameters: - coords (np.array) – Coordinates of annotations.
- x (int) – X coordinate of the top-left corner of the patch.
- y (int) – Y coordinate of the top-left corner of the patch.
Returns: List of np.int64 indices of the bounding boxes on the patch.
Return type: idx_of_bb_on_patch (list)
-
corner_on_patch
(coords, x, y)¶ Check if at least one of the corners is on the patch.
Parameters: - coords (np.array) – Coordinates of annotations.
- x (int) – X coordinate of the top-left corner of the patch.
- y (int) – Y coordinate of the top-left corner of the patch.
Returns: List of np.int64 indices of the bounding boxes on the patch.
Return type: idx_of_bb_on_patch (list)
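A corner test of this kind can be sketched in plain Python as follows. The box layout [left, top, right, bottom], the explicit p_width and p_height arguments, and the inclusive patch borders are assumptions for illustration; the library version works on NumPy arrays and takes the patch size from the Patcher object.

```python
def corner_on_patch_sketch(coords, x, y, p_width=256, p_height=256):
    """Return indices of boxes with at least one corner inside the patch."""
    indices = []
    for idx, (left, top, right, bottom) in enumerate(coords):
        corners = [(left, top), (right, top), (left, bottom), (right, bottom)]
        if any(x <= cx <= x + p_width and y <= cy <= y + p_height
               for cx, cy in corners):
            indices.append(idx)
    return indices


boxes = [
    [10, 10, 50, 50],      # fully inside the patch at (0, 0)
    [200, 200, 300, 300],  # only the top-left corner is inside that patch
    [400, 400, 500, 500],  # entirely outside that patch
]
print(corner_on_patch_sketch(boxes, 0, 0))
print(corner_on_patch_sketch(boxes, 256, 256))
```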
-
find_bbs
(x, y, cls)¶ Find bounding boxes which are on the patch.
- A bounding box with at least one of its corners on the patch counts as being on the patch.
- Example: annotation.mask_coords["benign"][0] = [small_x, small_y, large_x, large_y] = [bbleft, bbtop, bbright, bbbottom]
Parameters: - x (int) – X-axis offset of patch.
- y (int) – Y-axis offset of patch.
- cls (str) – Class of the patch or the bounding box or the segmented area.
-
find_masks
(x, y, cls)¶ Get the masked area corresponding to the given patch area.
Parameters: - x (int) – X-axis offset of a patch.
- y (int) – Y-axis offset of a patch.
- cls (str) – Class of the patch or the bounding box or the segmented area.
Returns: List containing a dict of coords and its class. Here, coords is a path to the PNG image.
Return type: masks (list)
-
get_mini_patch_parallel
(classes=False)¶
-
get_patch
(x, y, classes=False)¶ Extract a single patch.
Parameters: - x (int) – X-axis offset of a patch.
- y (int) – Y-axis offset of a patch.
- classes (list) – When the method is classification, the patch is extracted multiple times if it lies on the border of two or more classes. To prevent the patcher from extracting a single patch for multiple classes, setting on_annotation=1.0 should work.
-
get_patch_parallel
(classes=False, cores=-1)¶ Run get_patch() in parallel.
Parameters: - classes (list) – Classes to extract.
- cores (int) – Number of threads to run. -1 means the same as the number of cores.
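The convention that cores=-1 means one worker per CPU core can be sketched with the standard library. ThreadPoolExecutor is used here purely as a stand-in, because this reference does not say which parallel backend the library uses; resolve_workers, run_parallel, and describe_patch are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_workers(cores=-1):
    """Translate cores=-1 into the number of CPU cores reported by the OS."""
    if cores == -1:
        return os.cpu_count() or 1
    return cores


def run_parallel(func, offsets, cores=-1):
    """Apply func(x, y) to every offset with a pool of threads, in order."""
    with ThreadPoolExecutor(max_workers=resolve_workers(cores)) as pool:
        return list(pool.map(lambda xy: func(*xy), offsets))


def describe_patch(x, y):
    # Hypothetical stand-in for the per-patch work done by get_patch().
    return f"patch at ({x}, {y})"


results = run_parallel(describe_patch, [(0, 0), (256, 0), (0, 256)], cores=2)
print(results)
```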
-
get_random_sample
(phase, sample_count=1)¶ Get random patches to check that the patcher works properly.
Parameters: - phase (str) – When to check. One of {start, finish}
- sample_count (int) – Number of patches to extract.
-
patch_on_annotation
(cls, x, y)¶ Check if the patch is on the annotation area of a class.
Parameters: - cls (str) – Class of the patch or the bounding box or the segmented area.
- x (int) – X-axis offset of a patch.
- y (int) – Y-axis offset of a patch.
Returns: Whether the patch is on the annotation.
Return type: (bool)
-
patch_on_foreground
(x, y)¶ Check if the patch is on the foreground area.
Parameters: - x (int) – X-axis offset of a patch.
- y (int) – Y-axis offset of a patch.
Returns: Whether the patch is on the foreground area.
Return type: (bool)
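Both patch_on_annotation and patch_on_foreground compare an overlap ratio against a threshold (on_annotation or on_foreground). A minimal sketch of that comparison on a binary mask is given below; the helper name patch_on_mask, the nested-list mask, and the use of >= rather than > are assumptions.

```python
def patch_on_mask(mask, x, y, p_width, p_height, threshold=0.5):
    """Return True when the covered fraction of the patch reaches threshold."""
    covered = sum(
        1
        for row in mask[y:y + p_height]
        for value in row[x:x + p_width]
        if value
    )
    return covered / (p_width * p_height) >= threshold


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(patch_on_mask(mask, 0, 0, 2, 2))  # fully covered patch
print(patch_on_mask(mask, 0, 2, 2, 2))  # one covered pixel out of four
```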
-
remove_dup_in_results
()¶ Remove duplicate results in self.result[“results”]
-
save_patch_result
(x, y, cls)¶ Save the extracted patch data to result
Parameters: - x (int) – X-axis offset of patch.
- y (int) – Y-axis offset of patch.
- cls (str) – Class of the patch or the bounding box or the segmented area.
-
save_results
()¶ Save the extraction results.
Saves some metadata along with the patch results.
-
side_on_patch
(coords, x, y)¶ Check if at least one of the sides is on the patch.
Parameters: - coords (np.array) – Coordinates of annotations.
- x (int) – X coordinate of the top-left corner of the patch.
- y (int) – Y coordinate of the top-left corner of the patch.
Returns: List of np.int64 indices of the bounding boxes on the patch.
Return type: idx_of_bb_on_patch (list)
-
to_bb
(coord)¶ Convert coordinates to VOC-style coordinates.
Parameters: coord (list) – List of coordinates stored as below:
[[xOfOneCorner, yOfOneCorner], [xOfApex, yOfApex]]
Returns: List of coordinates stored as below: [[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]]
Return type: outer_coord (list)
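The input and output layouts documented above can be reproduced with a short sketch. The function name to_bb_sketch is hypothetical, and sorting the two corners with min and max is an assumption about how arbitrary corner order is handled.

```python
def to_bb_sketch(coord):
    """Expand two opposite corners into the four corners of the box."""
    (x1, y1), (x2, y2) = coord
    xmin, xmax = min(x1, x2), max(x1, x2)
    ymin, ymax = min(y1, y2), max(y1, y2)
    return [[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin]]


outer_coord = to_bb_sketch([[120, 40], [20, 90]])
print(outer_coord)
```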
wsiprocess.rule
Object to define rules for extracting patches.
The rule file should be a JSON file. The content of rule.json looks like the example below.
Example
The JSON data below defines the following rules:
- extract the patches of benign and malignant
- benign includes stroma but excludes malignant or uncertain
- malignant means malignant itself but excludes benign
{
    "benign" : {
        "includes" : [
            "stroma"
        ],
        "excludes" : [
            "malignant",
            "uncertain"
        ]
    },
    "malignant" : {
        "includes" : [
        ],
        "excludes" : [
            "benign"
        ]
    }
}
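A rule of this shape can be inspected with the standard json module before handing the file to wsiprocess. The sketch below parses an equivalent JSON string and lists the classes with their includes and excludes; it only illustrates the file layout and makes no claim about how the Rule class parses it.

```python
import json

RULE_JSON = """
{
    "benign": {
        "includes": ["stroma"],
        "excludes": ["malignant", "uncertain"]
    },
    "malignant": {
        "includes": [],
        "excludes": ["benign"]
    }
}
"""

rule = json.loads(RULE_JSON)
classes = list(rule)  # top-level keys are the classes to extract

for cls in classes:
    print(cls, "includes", rule[cls]["includes"],
          "excludes", rule[cls]["excludes"])
```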
-
class
wsiprocess.rule.
Rule
(path)¶ Base class for rule.
Parameters: path (str) – Path to the rule.json file.
-
classes
¶ List of the classes, e.g. [“benign”, “malignant”]
Type: list
-
read_rule
¶ Dict of rules, e.g. {“benign”: {“includes”: [“stroma”]}}
Type: dict
-
read_rule
() Read the rule file.
Parse the rule file and store its classes.
-
wsiprocess.slide
Slide object to pass to the Annotation object and the Patcher object. A slide is a whole slide image scanned with a whole slide scanner. You can also manually make a pyramidal TIFF file, which can be handled the same way as scanned digital data, except for the magnification.
-
class
wsiprocess.slide.
Slide
(path)¶ Slide object.
Parameters: path (str) – Path to the whole slide image file.
-
path
¶ Path to the whole slide image file.
Type: str
-
slide
¶ pyvips Image object.
Type: pyvips.Image
-
wsi_width
¶ Width of slide.
Type: int
-
wsi_height
¶ Height of slide.
Type: int
-
export_thumbnail
(save_as='./thumb.png', size=500)¶ Export thumbnail image.
Parameters: - save_as (str) – Path to save as the thumbnail image.
- size (int, optional) – Size of the exported thumbnail.
-
get_thumbnail
(size=500)¶ Get thumbnail image.
Parameters: size (int, optional) – Size of the exported thumbnail.
-
wsiprocess.utils
-
wsiprocess.utils.
show_bounding_box
(patch_path, result_path, save_as)¶
-
wsiprocess.utils.
show_mask_on_patch
(patch_path, mask_path, save_as)¶
wsiprocess.verify
Verification script that runs before the patcher starts working. The Verify class verifies the output directory, annotation files, rule files, etc. It mainly runs for the CLI.
-
class
wsiprocess.verify.
Verify
(save_to, filestem, method, start_sample, finished_sample, no_patches, crop_bbox)¶ Verification class.
Parameters: - save_to (str) – The root of the output directory.
- filestem (str) – The name of the output directory.
- method (str) – Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}
- start_sample (bool) – Whether to save sample patches on Patcher start.
- finished_sample (bool) – Whether to save sample patches on Patcher finish.
- extract_patches (bool) – [Deleted] Whether to save patches when Patcher runs.
- no_patches (bool) – Whether to skip saving patches when Patcher runs.
-
save_to
¶ The root of the output directory.
Type: str
-
filestem
¶ The name of the output directory.
Type: str
-
method
¶ Method name to run. One of {“none”, “classification”, “detection”, “segmentation”}
Type: str
-
start_sample
¶ Whether to save sample patches on Patcher start.
Type: bool
-
finished_sample
¶ Whether to save sample patches on Patcher finish.
Type: bool
-
extract_patches
¶ [Deleted] Whether to save patches when Patcher runs.
Type: bool
-
no_patches
¶ Whether to skip saving patches when Patcher runs.
Type: bool
-
magnification
(slide, magnification)¶ Check if the slide has data for the magnification the user specified.
Parameters: - slide (wp.slide.Slide) – Slide object to check.
- magnification (int) – Target magnification value which has to be smaller than the magnification of the slide.
-
static
make_dir
(path)¶ Make output directory.
-
make_dirs
()¶ Ensure the output directories exist for each task.
-
on_params
(on_annotation, on_foreground)¶ Verify the ratio of on_annotation.
Parameters: - on_annotation (float) – Overlap ratio of patches and annotations.
- on_foreground (float) – Overlap ratio of patches and foreground area
Raises: wsiprocess.error.SizeError – If the sizes are invalid.
-
sizes
(wsi_width, wsi_height, offset_x, offset_y, patch_width, patch_height, overlap_width, overlap_height, dot_bbox_width, dot_bbox_height)¶ Verify the sizes of the slide, the patch and the overlap area.
Raises: wsiprocess.error.SizeError – If the sizes are invalid.