odak.learn.perception
Defines a number of different perceptual loss functions, which can be used to optimise images where gaze location is known.
BlurLoss
¶
BlurLoss
implements two different blur losses. When blur_source
is set to False
, it implements blur_match: the source input image is optimised to match a blurred version of the target image.
When blur_source
is set to True
, it implements blur_lowpass: a blurred version of the input image is matched to the blurred target image, so that only the low frequencies of the source input image are matched to the low frequencies of the target.
The interface is similar to other pytorch
loss functions, but note that the gaze location must be provided in addition to the source and target images.
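A minimal usage sketch follows; the tensor sizes and gaze location are illustrative, and the import assumes BlurLoss is exposed from odak.learn.perception as documented here.
```python
import torch
from odak.learn.perception import BlurLoss

# Illustrative RGB tensors in NCHW format.
image = torch.rand(1, 3, 256, 256, requires_grad = True)
target = torch.rand(1, 3, 256, 256)

loss_function = BlurLoss(
                         alpha = 0.2,
                         real_image_width = 0.2,
                         real_viewing_distance = 0.7,
                         blur_source = False
                        )
loss = loss_function(image, target, gaze = [0.4, 0.6]) # Gaze in normalised [0, 1] coordinates.
loss.backward()
```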
Source code in odak/learn/perception/blur_loss.py
__call__(image, target, gaze=[0.5, 0.5])
¶
Calculates the Blur Loss.
Parameters:
-
image
–Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
target
–Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
gaze
–Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
Returns:
-
loss
(tensor
) –The computed loss.
Source code in odak/learn/perception/blur_loss.py
__init__(device=torch.device('cpu'), alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, mode='quadratic', blur_source=False, equi=False)
¶
Parameters:
-
alpha
–parameter controlling foveation - larger values mean bigger pooling regions.
-
real_image_width
–The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance.
-
real_viewing_distance
–The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width.
-
mode
–Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
-
blur_source
–If true, blurs the source image as well as the target before computing the loss.
-
equi
–If true, run the loss in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The gaze argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2]
Source code in odak/learn/perception/blur_loss.py
CVVDP
¶
Bases: Module
Source code in odak/learn/perception/learned_perceptual_losses.py
__init__(device=torch.device('cpu'))
¶
Initializes the CVVDP model with a specified device.
Parameters:
-
device
–The device (CPU/GPU) on which the computations will be performed. Defaults to CPU.
Source code in odak/learn/perception/learned_perceptual_losses.py
forward(predictions, targets, dim_order='CHW')
¶
Parameters:
-
predictions
–The predicted images.
-
targets
–The ground truth images.
-
dim_order
–The dimension order of the input images. Defaults to 'CHW' (channels, height, width).
Returns:
-
result
(tensor
) –The computed loss if successful, otherwise 0.0.
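A brief usage sketch follows; it assumes the external ColorVideoVDP dependency wrapped by this class is installed and that CVVDP is importable from odak.learn.perception. FVVDP below follows the same call pattern.
```python
import torch
from odak.learn.perception import CVVDP

# Illustrative CHW tensors, matching the default dim_order.
predictions = torch.rand(3, 256, 256)
targets = torch.rand(3, 256, 256)

loss_function = CVVDP(device = torch.device('cpu'))
loss = loss_function(predictions, targets, dim_order = 'CHW')
```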
Source code in odak/learn/perception/learned_perceptual_losses.py
FVVDP
¶
Bases: Module
Source code in odak/learn/perception/learned_perceptual_losses.py
__init__(device=torch.device('cpu'))
¶
Initializes the FVVDP model with a specified device.
Parameters:
-
device
–The device (CPU/GPU) on which the computations will be performed. Defaults to CPU.
Source code in odak/learn/perception/learned_perceptual_losses.py
forward(predictions, targets, dim_order='CHW')
¶
Parameters:
-
predictions
–The predicted images.
-
targets
–The ground truth images.
-
dim_order
–The dimension order of the input images. Defaults to 'CHW' (channels, height, width).
Returns:
-
result
(tensor
) –The computed loss if successful, otherwise 0.0.
Source code in odak/learn/perception/learned_perceptual_losses.py
LPIPS
¶
Bases: Module
Source code in odak/learn/perception/learned_perceptual_losses.py
__init__()
¶
Initializes the LPIPS (Learned Perceptual Image Patch Similarity) model.
Source code in odak/learn/perception/learned_perceptual_losses.py
forward(predictions, targets)
¶
Parameters:
-
predictions
–The predicted images.
-
targets
–The ground truth images.
Returns:
-
result
(tensor
) –The computed loss if successful, otherwise 0.0.
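A brief usage sketch follows; it assumes the learned LPIPS backend required by this wrapper is installed and that LPIPS is importable from odak.learn.perception.
```python
import torch
from odak.learn.perception import LPIPS

# Illustrative CHW tensors.
predictions = torch.rand(3, 256, 256)
targets = torch.rand(3, 256, 256)

loss_function = LPIPS()
loss = loss_function(predictions, targets)
```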
Source code in odak/learn/perception/learned_perceptual_losses.py
MSSSIM
¶
Bases: Module
A class to calculate multi-scale structural similarity index of an image with respect to a ground truth image.
Source code in odak/learn/perception/image_quality_losses.py
forward(predictions, targets)
¶
Parameters:
-
predictions
(tensor
) –The predicted images.
-
targets
–The ground truth images.
Returns:
-
result
(tensor
) –The computed MS-SSIM value if successful, otherwise 0.0.
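A brief usage sketch follows; SSIM and PSNR, documented further below, share the same call pattern. The NCHW tensor layout here is an assumption for illustration.
```python
import torch
from odak.learn.perception import MSSSIM

# Illustrative image batches; NCHW layout assumed here.
predictions = torch.rand(1, 3, 256, 256)
targets = torch.rand(1, 3, 256, 256)

loss_function = MSSSIM()
result = loss_function(predictions, targets)
```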
Source code in odak/learn/perception/image_quality_losses.py
MetamerMSELoss
¶
The MetamerMSELoss
class provides a perceptual loss function. This generates a metamer for the target image, and then optimises the source image to be the same as this target image metamer.
Please note this is different to MetamericLoss
which optimises the source image to be any metamer of the target image.
Its interface is similar to other pytorch
loss functions, but note that the gaze location must be provided in addition to the source and target images.
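A minimal usage sketch follows; the tensor sizes and gaze are illustrative, and the resolution is chosen to be a multiple of 2^n_pyramid_levels.
```python
import torch
from odak.learn.perception import MetamerMSELoss

# Illustrative RGB tensors in NCHW format; 512 is a multiple of 2^5.
image = torch.rand(1, 3, 512, 512, requires_grad = True)
target = torch.rand(1, 3, 512, 512)

loss_function = MetamerMSELoss(
                               alpha = 0.2,
                               real_image_width = 0.2,
                               real_viewing_distance = 0.7,
                               n_pyramid_levels = 5,
                               n_orientations = 2
                              )
loss = loss_function(image, target, gaze = [0.5, 0.5])
loss.backward()
```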
Source code in odak/learn/perception/metamer_mse_loss.py
__call__(image, target, gaze=[0.5, 0.5])
¶
Calculates the Metamer MSE Loss.
Parameters:
-
image
–Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
target
–Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
gaze
–Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
Returns:
-
loss
(tensor
) –The computed loss.
Source code in odak/learn/perception/metamer_mse_loss.py
__init__(device=torch.device('cpu'), alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, mode='quadratic', n_pyramid_levels=5, n_orientations=2, equi=False)
¶
Parameters:
-
alpha
–parameter controlling foveation - larger values mean bigger pooling regions.
-
real_image_width
–The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance.
-
real_viewing_distance
–The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width.
-
n_pyramid_levels
–Number of levels of the steerable pyramid. Note that the image is padded so that both height and width are multiples of 2^(n_pyramid_levels), so setting this value too high will slow down the calculation a lot.
-
mode
–Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
-
n_orientations
–Number of orientations in the steerable pyramid. Can be 1, 2, 4 or 6. Increasing this will increase runtime.
-
equi
–If true, run the loss in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The gaze argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2]
Source code in odak/learn/perception/metamer_mse_loss.py
gen_metamer(image, gaze)
¶
Generates a metamer for an image, following the method in this paper. This function can be used on its own to generate a metamer for a desired image.
Parameters:
-
image
–Image to compute metamer for. Should be an RGB image in NCHW format (4 dimensions)
-
gaze
–Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
Returns:
-
metamer
(tensor
) –The generated metamer image
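A short sketch of calling gen_metamer directly to produce a metamer of an existing image; the loss settings, tensor size and gaze are illustrative.
```python
import torch
from odak.learn.perception import MetamerMSELoss

image = torch.rand(1, 3, 512, 512)  # NCHW RGB input.

loss_function = MetamerMSELoss()
metamer = loss_function.gen_metamer(image, gaze = [0.5, 0.5])
```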
Source code in odak/learn/perception/metamer_mse_loss.py
MetamericLoss
¶
The MetamericLoss
class provides a perceptual loss function.
Rather than exactly match the source image to the target, it tries to ensure the source is a metamer to the target image.
Its interface is similar to other pytorch
loss functions, but note that the gaze location must be provided in addition to the source and target images.
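A minimal usage sketch follows; the tensor sizes, gaze and settings are illustrative.
```python
import torch
from odak.learn.perception import MetamericLoss

# Illustrative RGB tensors in NCHW format; 512 is a multiple of 2^5.
image = torch.rand(1, 3, 512, 512, requires_grad = True)
target = torch.rand(1, 3, 512, 512)

loss_function = MetamericLoss(
                              alpha = 0.2,
                              n_pyramid_levels = 5,
                              n_orientations = 2
                             )
loss = loss_function(image, target, gaze = [0.5, 0.5], image_colorspace = 'RGB')
loss.backward()
```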
Source code in odak/learn/perception/metameric_loss.py
__call__(image, target, gaze=[0.5, 0.5], image_colorspace='RGB', visualise_loss=False)
¶
Calculates the Metameric Loss.
Parameters:
-
image
–Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
target
–Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
image_colorspace
–The current colorspace of your image and target. Ignored if input does not have 3 channels. Accepted values: RGB, YCrCb.
-
gaze
–Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
-
visualise_loss
–Shows a heatmap indicating which parts of the image contributed most to the loss.
Returns:
-
loss
(tensor
) –The computed loss.
Source code in odak/learn/perception/metameric_loss.py
__init__(device=torch.device('cpu'), alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, n_pyramid_levels=5, mode='quadratic', n_orientations=2, use_l2_foveal_loss=True, fovea_weight=20.0, use_radial_weight=False, use_fullres_l0=False, equi=False)
¶
Parameters:
-
alpha
–parameter controlling foveation - larger values mean bigger pooling regions.
-
real_image_width
–The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance.
-
real_viewing_distance
–The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width.
-
n_pyramid_levels
–Number of levels of the steerable pyramid. Note that the image is padded so that both height and width are multiples of 2^(n_pyramid_levels), so setting this value too high will slow down the calculation a lot.
-
mode
–Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
-
n_orientations
–Number of orientations in the steerable pyramid. Can be 1, 2, 4 or 6. Increasing this will increase runtime.
-
use_l2_foveal_loss
–If true, pixels whose pooling size is 1 pixel at the largest scale are compared with a direct L2 loss against the target, rather than pooling over pyramid levels. In practice this gives better results when the loss is used for holography.
-
fovea_weight
–A weight to apply to the foveal region if use_l2_foveal_loss is set to True.
-
use_radial_weight
–If True, will apply a radial weighting when calculating the difference between the source and target stats maps. This weights stats closer to the fovea more than those further away.
-
use_fullres_l0
–If true, stats for the lowpass residual are replaced with blurred versions of the full-resolution source and target images.
-
equi
–If true, run the loss in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The gaze argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2]
Source code in odak/learn/perception/metameric_loss.py
MetamericLossUniform
¶
Measures metameric loss between a given image and a metamer of the given target image. This variant of the metameric loss is not foveated - it applies uniform pooling sizes to the whole input image.
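A minimal usage sketch follows; the pooling size and tensor sizes are illustrative.
```python
import torch
from odak.learn.perception import MetamericLossUniform

image = torch.rand(1, 3, 512, 512, requires_grad = True)
target = torch.rand(1, 3, 512, 512)

loss_function = MetamericLossUniform(
                                     pooling_size = 32,
                                     n_pyramid_levels = 5,
                                     n_orientations = 2
                                    )
loss = loss_function(image, target)
loss.backward()
```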
Source code in odak/learn/perception/metameric_loss_uniform.py
__call__(image, target, image_colorspace='RGB', visualise_loss=False)
¶
Calculates the Metameric Loss.
Parameters:
-
image
–Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
target
–Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions)
-
image_colorspace
–The current colorspace of your image and target. Ignored if input does not have 3 channels. Accepted values: RGB, YCrCb.
-
visualise_loss
–Shows a heatmap indicating which parts of the image contributed most to the loss.
Returns:
-
loss
(tensor
) –The computed loss.
Source code in odak/learn/perception/metameric_loss_uniform.py
__init__(device=torch.device('cpu'), pooling_size=32, n_pyramid_levels=5, n_orientations=2)
¶
Parameters:
-
pooling_size
–Pooling size, in pixels. For example 32 will pool over 32x32 blocks of the image.
-
n_pyramid_levels
–Number of levels of the steerable pyramid. Note that the image is padded so that both height and width are multiples of 2^(n_pyramid_levels), so setting this value too high will slow down the calculation a lot.
-
n_orientations
–Number of orientations in the steerable pyramid. Can be 1, 2, 4 or 6. Increasing this will increase runtime.
Source code in odak/learn/perception/metameric_loss_uniform.py
gen_metamer(image)
¶
Generates a metamer for an image, following the method in this paper. This function can be used on its own to generate a metamer for a desired image.
Parameters:
-
image
–Image to compute metamer for. Should be an RGB image in NCHW format (4 dimensions)
Returns:
-
metamer
(tensor
) –The generated metamer image
Source code in odak/learn/perception/metameric_loss_uniform.py
PSNR
¶
Bases: Module
A class to calculate peak-signal-to-noise ratio of an image with respect to a ground truth image.
Source code in odak/learn/perception/image_quality_losses.py
forward(predictions, targets, peak_value=1.0)
¶
A function to calculate peak-signal-to-noise ratio of an image with respect to a ground truth image.
Parameters:
-
predictions
–Image to be tested.
-
targets
–Ground truth image.
-
peak_value
–Peak value that given tensors could have.
Returns:
-
result
(tensor
) –Peak-signal-to-noise ratio.
Source code in odak/learn/perception/image_quality_losses.py
RadiallyVaryingBlur
¶
The RadiallyVaryingBlur
class provides a way to apply a radially varying blur to an image. Given a gaze location and information about the image and foveation, it applies a blur that will achieve the proper pooling size. The pooling size is chosen to appear the same at a range of display sizes and viewing distances, for a given alpha
parameter value. For more information on how the pooling sizes are computed, please see link coming soon.
The blur is accelerated by generating and sampling from MIP maps of the input image.
This class caches the foveation information. This means that if it is run repeatedly with the same foveation parameters, gaze location and image size (e.g. in an optimisation loop) it won't recalculate the pooling maps.
If you are repeatedly applying blur to images of different sizes (e.g. a pyramid), use one instance of this class per image size for best performance.
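A minimal usage sketch follows; the image size and gaze centre are illustrative.
```python
import torch
from odak.learn.perception import RadiallyVaryingBlur

image = torch.rand(1, 3, 512, 512)  # NCHW input.

blur = RadiallyVaryingBlur()
output = blur.blur(
                   image,
                   alpha = 0.2,
                   real_image_width = 0.2,
                   real_viewing_distance = 0.7,
                   centre = (0.5, 0.5),  # Normalised gaze location.
                   mode = 'quadratic'
                  )
```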
Source code in odak/learn/perception/radially_varying_blur.py
blur(image, alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, centre=None, mode='quadratic', equi=False)
¶
Apply the radially varying blur to an image.
Parameters:
-
image
–The image to blur, in NCHW format.
-
alpha
–parameter controlling foveation - larger values mean bigger pooling regions.
-
real_image_width
–The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance. Ignored in equirectangular mode (equi==True)
-
real_viewing_distance
–The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width. Ignored in equirectangular mode (equi==True)
-
centre
–The centre of the radially varying blur (the gaze location). Should be a tuple of floats containing normalised image coordinates in the range [0,1]. In equirectangular mode this should be yaw & pitch angles in [-pi,pi]x[-pi/2,pi/2]
-
mode
–Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
-
equi
–If true, run the blur function in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The centre argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2]
Returns:
-
output
(tensor
) –The blurred image
Source code in odak/learn/perception/radially_varying_blur.py
SSIM
¶
Bases: Module
A class to calculate structural similarity index of an image with respect to a ground truth image.
Source code in odak/learn/perception/image_quality_losses.py
forward(predictions, targets)
¶
Parameters:
-
predictions
(tensor
) –The predicted images.
-
targets
–The ground truth images.
Returns:
-
result
(tensor
) –The computed SSIM value if successful, otherwise 0.0.
Source code in odak/learn/perception/image_quality_losses.py
SpatialSteerablePyramid
¶
This implements a real-valued steerable pyramid where the filtering is carried out spatially (using convolution) as opposed to multiplication in the Fourier domain. This has a number of optimisations over previous implementations that increase efficiency, but introduce some reconstruction error.
Source code in odak/learn/perception/spatial_steerable_pyramid.py
__init__(use_bilinear_downup=True, n_channels=1, filter_size=9, n_orientations=6, filter_type='full', device=torch.device('cpu'))
¶
Parameters:
-
use_bilinear_downup
–This uses bilinear filtering when upsampling/downsampling, rather than the original approach of applying a large lowpass kernel and sampling even rows/columns
-
n_channels
–Number of channels in the input images (e.g. 3 for RGB input)
-
filter_size
–Desired size of filters (e.g. 3 will use 3x3 filters).
-
n_orientations
–Number of oriented bands in each level of the pyramid.
-
filter_type
–This can be used to select smaller filters than the original ones if desired. full: the original filter sizes. cropped: some filters are cut back in size by extracting the centre and scaling as appropriate. trained: same as reduced, but the oriented kernels are replaced by learned 5x5 kernels.
-
device
–torch device the input images will be supplied from.
Source code in odak/learn/perception/spatial_steerable_pyramid.py
construct_pyramid(image, n_levels, multiple_highpass=False)
¶
Constructs and returns a steerable pyramid for the provided image.
Parameters:
-
image
–The input image, in NCHW format. The number of channels C should match n_channels when the pyramid maker was created.
-
n_levels
–Number of levels in the constructed steerable pyramid.
-
multiple_highpass
–If true, computes a highpass for each level of the pyramid. These extra levels are redundant (not used for reconstruction).
Returns:
-
pyramid
(list of dicts of torch.tensor
) –The computed steerable pyramid. Each level is an entry in a list. The pyramid is ordered from largest levels to smallest levels. Each level is stored as a dict with the following keys: "h" (highpass residual), "l" (lowpass residual) and "b" (oriented bands, a list of torch.tensor).
Source code in odak/learn/perception/spatial_steerable_pyramid.py
reconstruct_from_pyramid(pyramid)
¶
Reconstructs an input image from a steerable pyramid.
Parameters:
-
pyramid
(list of dicts of torch.tensor
–The steerable pyramid. Should be in the same format as output by construct_pyramid(). The number of channels should match n_channels when the pyramid maker was created.
Returns:
-
image
(tensor
) –The reconstructed image, in NCHW format.
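A round-trip usage sketch follows; the single-channel image and pyramid depth are illustrative, and the import assumes SpatialSteerablePyramid is exposed from odak.learn.perception (otherwise import it from odak.learn.perception.spatial_steerable_pyramid).
```python
import torch
from odak.learn.perception import SpatialSteerablePyramid

image = torch.rand(1, 1, 256, 256)  # NCHW; C must match n_channels below.

pyramid_maker = SpatialSteerablePyramid(n_channels = 1, n_orientations = 6)
pyramid = pyramid_maker.construct_pyramid(image, n_levels = 5)
reconstruction = pyramid_maker.reconstruct_from_pyramid(pyramid)
```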
Source code in odak/learn/perception/spatial_steerable_pyramid.py
display_color_hvs
¶
Source code in odak/learn/perception/color_conversion.py
__call__(input_image, ground_truth, gaze=None)
¶
Evaluates an input image against a target ground truth image for a given viewer gaze.
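A brief usage sketch follows; the display parameters mirror the defaults above, and the NCHW tensor layout is an assumption for illustration.
```python
import torch
from odak.learn.perception import display_color_hvs

# Illustrative RGB tensors; NCHW layout assumed here.
input_image = torch.rand(1, 3, 256, 256)
ground_truth = torch.rand(1, 3, 256, 256)

display_model = display_color_hvs(
                                  resolution = [1920, 1080],
                                  distance_from_screen = 800,
                                  pixel_pitch = 0.311,
                                  device = torch.device('cpu')
                                 )
loss = display_model(input_image, ground_truth)
```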
Source code in odak/learn/perception/color_conversion.py
__init__(resolution=[1920, 1080], distance_from_screen=800, pixel_pitch=0.311, read_spectrum='tensor', primaries_spectrum=torch.rand(3, 301), device=torch.device('cpu'))
¶
Parameters:
-
resolution
–Resolution of the display in pixels.
-
distance_from_screen
–Distance from the screen in mm.
-
pixel_pitch
–Pixel pitch of the display in mm.
-
read_spectrum
–Spectrum of the display. Default is 'default', which is the spectrum of the Dell U2415 display.
-
device
–Device to run the code on. Default is None, which means the code will run on the CPU.
Source code in odak/learn/perception/color_conversion.py
cone_response_to_spectrum(cone_spectrum, light_spectrum)
¶
Internal function to calculate the cone response to a particular light spectrum.
Parameters:
-
cone_spectrum
–Spectrum, Wavelength [2,300] tensor
-
light_spectrum
–Spectrum, Wavelength [2,300] tensor
Returns:
-
response_to_spectrum
(float
) –Response of cone to light spectrum [1x1]
Source code in odak/learn/perception/color_conversion.py
construct_matrix_lms(l_response, m_response, s_response)
¶
Internal function to construct the 3x3 LMS transform matrix from the given cone response spectra.
Parameters:
-
l_response
–Cone response spectrum tensor (normalised response vs wavelength)
-
m_response
–Cone response spectrum tensor (normalised response vs wavelength)
-
s_response
–Cone response spectrum tensor (normalised response vs wavelength)
Returns:
-
lms_image_tensor
(tensor
) –3x3 LMSrgb tensor
Source code in odak/learn/perception/color_conversion.py
construct_matrix_primaries(l_response, m_response, s_response)
¶
Internal function to construct the 3x3 transform matrix for the display primaries from the given cone response spectra.
Parameters:
-
l_response
–Cone response spectrum tensor (normalised response vs wavelength)
-
m_response
–Cone response spectrum tensor (normalised response vs wavelength)
-
s_response
–Cone response spectrum tensor (normalised response vs wavelength)
Returns:
-
lms_image_tensor
(tensor
) –3x3 LMSrgb tensor
Source code in odak/learn/perception/color_conversion.py
display_spectrum_response(wavelength, function)
¶
Internal function to provide the display light spectrum response at a particular wavelength.
Parameters:
-
wavelength
–Wavelength in nm [400...700]
-
function
–Display light spectrum distribution function
Returns:
-
light_response_dict
(float
) –Display light spectrum response value
Source code in odak/learn/perception/color_conversion.py
initialize_cones_normalised()
¶
Internal function to initialize normalised L, M and S cones as normal distributions with the given sigma and mu values.
Returns:
-
l_cone_n
(tensor
) –Normalised L cone distribution.
-
m_cone_n
(tensor
) –Normalised M cone distribution.
-
s_cone_n
(tensor
) –Normalised S cone distribution.