odak.learn.perception
Defines a number of different perceptual loss functions, which can be used to optimise images where gaze location is known.
BlurLoss
BlurLoss implements two different blur losses. When blur_source is set to False, it implements blur_match, which tries to match the source input image to a blurred version of the target image. When blur_source is set to True, it implements blur_lowpass, which matches the blurred version of the input image to the blurred target image, so that only the low frequencies of the source input image are matched to the low frequencies of the target.
The interface is similar to other pytorch loss functions, but note that the gaze location must be provided in addition to the source and target images.
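A minimal usage sketch, assuming BlurLoss is importable directly from odak.learn.perception; tensor sizes are illustrative only:
```python
import torch
from odak.learn.perception import BlurLoss  # assumed import path

# Construct the loss with the documented defaults; blur_source = False selects blur_match.
loss_function = BlurLoss(device = torch.device('cpu'), alpha = 0.2, blur_source = False)

image  = torch.rand(1, 3, 256, 256, requires_grad = True)  # source image, NCHW
target = torch.rand(1, 3, 256, 256)                        # target image, NCHW

loss = loss_function(image, target, gaze = [0.5, 0.5])     # gaze at the image centre
loss.backward()
```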
Source code in odak/learn/perception/blur_loss.py
__call__(image, target, gaze=[0.5, 0.5])
Calculates the Blur Loss.
Parameters:
- image: Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- target: Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- gaze: Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
Returns:
- loss (tensor): The computed loss.
Source code in odak/learn/perception/blur_loss.py
__init__(device=torch.device('cpu'), alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, mode='quadratic', blur_source=False, equi=False)
Parameters:
- alpha: Parameter controlling foveation - larger values mean bigger pooling regions.
- real_image_width: The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance.
- real_viewing_distance: The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width.
- mode: Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
- blur_source: If True, blurs the source image as well as the target before computing the loss.
- equi: If True, run the loss in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The gaze argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2].
Source code in odak/learn/perception/blur_loss.py
CVVDP
Bases: Module
Source code in odak/learn/perception/learned_perceptual_losses.py
__init__(device=torch.device('cpu'))
Initializes the CVVDP model with a specified device.
Parameters:
- device: The device (CPU/GPU) on which the computations will be performed. Defaults to CPU.
Source code in odak/learn/perception/learned_perceptual_losses.py
forward(predictions, targets, dim_order='BCHW')
Parameters:
- predictions: The predicted images.
- targets: The ground truth images.
- dim_order: The dimension order of the input images. Defaults to 'BCHW' (batch, channels, height, width).
Returns:
- result (tensor): The computed loss if successful, otherwise 0.0.
Source code in odak/learn/perception/learned_perceptual_losses.py
FVVDP
Bases: Module
Source code in odak/learn/perception/learned_perceptual_losses.py
__init__(device=torch.device('cpu'))
Initializes the FVVDP model with a specified device.
Parameters:
- device: The device (CPU/GPU) on which the computations will be performed. Defaults to CPU.
Source code in odak/learn/perception/learned_perceptual_losses.py
forward(predictions, targets, dim_order='BCHW')
Parameters:
- predictions: The predicted images.
- targets: The ground truth images.
- dim_order: The dimension order of the input images. Defaults to 'BCHW' (batch, channels, height, width).
Returns:
- result (tensor): The computed loss if successful, otherwise 0.0.
Source code in odak/learn/perception/learned_perceptual_losses.py
LPIPS
Bases: Module
Source code in odak/learn/perception/learned_perceptual_losses.py
__init__()
Initializes the LPIPS (Learned Perceptual Image Patch Similarity) model.
Source code in odak/learn/perception/learned_perceptual_losses.py
forward(predictions, targets)
Parameters:
- predictions: The predicted images.
- targets: The ground truth images.
Returns:
- result (tensor): The computed loss if successful, otherwise 0.0.
Source code in odak/learn/perception/learned_perceptual_losses.py
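The CVVDP, FVVDP and LPIPS wrappers above share the same call pattern. A minimal sketch, assuming the classes are importable directly from odak.learn.perception and that the underlying metric packages they wrap are installed:
```python
import torch
from odak.learn.perception import CVVDP, FVVDP, LPIPS  # assumed import paths

predictions = torch.rand(1, 3, 256, 256)  # BCHW
targets     = torch.rand(1, 3, 256, 256)

cvvdp = CVVDP(device = torch.device('cpu'))
fvvdp = FVVDP(device = torch.device('cpu'))
lpips = LPIPS()

loss_cvvdp = cvvdp(predictions, targets, dim_order = 'BCHW')
loss_fvvdp = fvvdp(predictions, targets, dim_order = 'BCHW')
loss_lpips = lpips(predictions, targets)
```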
MSSSIM
Bases: Module
A class to calculate the multi-scale structural similarity index (MS-SSIM) of an image with respect to a ground truth image.
Source code in odak/learn/perception/image_quality_losses.py
forward(predictions, targets)
Parameters:
- predictions (tensor): The predicted images.
- targets: The ground truth images.
Returns:
- result (tensor): The computed MS-SSIM value if successful, otherwise 0.0.
Source code in odak/learn/perception/image_quality_losses.py
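A short sketch, assuming MSSSIM is importable directly from odak.learn.perception; the tensor sizes are illustrative and chosen large enough for multi-scale pooling:
```python
import torch
from odak.learn.perception import MSSSIM  # assumed import path

msssim = MSSSIM()
predictions = torch.rand(1, 3, 256, 256)  # NCHW
targets     = torch.rand(1, 3, 256, 256)
result = msssim(predictions, targets)
```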
MetamerMSELoss
The MetamerMSELoss class provides a perceptual loss function. It generates a metamer for the target image, and then optimises the source image to be the same as this target image metamer.
Please note this is different to MetamericLoss, which optimises the source image to be any metamer of the target image.
Its interface is similar to other pytorch loss functions, but note that the gaze location must be provided in addition to the source and target images.
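A minimal usage sketch, assuming MetamerMSELoss is importable directly from odak.learn.perception; tensor sizes are illustrative (512 is a multiple of 2^n_pyramid_levels):
```python
import torch
from odak.learn.perception import MetamerMSELoss  # assumed import path

loss_function = MetamerMSELoss(
                               device = torch.device('cpu'),
                               alpha = 0.2,
                               n_pyramid_levels = 5,
                               n_orientations = 2
                              )
image  = torch.rand(1, 3, 512, 512, requires_grad = True)  # source image, NCHW
target = torch.rand(1, 3, 512, 512)                        # target image, NCHW
loss = loss_function(image, target, gaze = [0.5, 0.5])
loss.backward()
```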
Source code in odak/learn/perception/metamer_mse_loss.py
__call__(image, target, gaze=[0.5, 0.5])
Calculates the Metamer MSE Loss.
Parameters:
- image: Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- target: Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- gaze: Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
Returns:
- loss (tensor): The computed loss.
Source code in odak/learn/perception/metamer_mse_loss.py
__init__(device=torch.device('cpu'), alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, mode='quadratic', n_pyramid_levels=5, n_orientations=2, equi=False)
Parameters:
- alpha: Parameter controlling foveation - larger values mean bigger pooling regions.
- real_image_width: The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance.
- real_viewing_distance: The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width.
- n_pyramid_levels: Number of levels of the steerable pyramid. Note that the image is padded so that both height and width are multiples of 2^(n_pyramid_levels), so setting this value too high will slow down the calculation a lot.
- mode: Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
- n_orientations: Number of orientations in the steerable pyramid. Can be 1, 2, 4 or 6. Increasing this will increase runtime.
- equi: If True, run the loss in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The gaze argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2].
Source code in odak/learn/perception/metamer_mse_loss.py
gen_metamer(image, gaze)
Generates a metamer for an image, following the method in this paper. This function can be used on its own to generate a metamer for a desired image.
Parameters:
- image: Image to compute metamer for. Should be an RGB image in NCHW format (4 dimensions).
- gaze: Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
Returns:
- metamer (tensor): The generated metamer image.
Source code in odak/learn/perception/metamer_mse_loss.py
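As noted above, gen_metamer can be called directly to produce a metamer of an image. A sketch under the same import assumption:
```python
import torch
from odak.learn.perception import MetamerMSELoss  # assumed import path

loss_function = MetamerMSELoss(device = torch.device('cpu'))
image = torch.rand(1, 3, 512, 512)                            # NCHW input
metamer = loss_function.gen_metamer(image, gaze = [0.5, 0.5]) # metamer of the input image
```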
MetamericLoss
The MetamericLoss class provides a perceptual loss function.
Rather than exactly matching the source image to the target, it tries to ensure the source is a metamer to the target image.
Its interface is similar to other pytorch loss functions, but note that the gaze location must be provided in addition to the source and target images.
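A minimal usage sketch, assuming MetamericLoss is importable directly from odak.learn.perception; tensor sizes are illustrative:
```python
import torch
from odak.learn.perception import MetamericLoss  # assumed import path

loss_function = MetamericLoss(
                              device = torch.device('cpu'),
                              alpha = 0.2,
                              n_pyramid_levels = 5,
                              n_orientations = 2
                             )
image  = torch.rand(1, 3, 512, 512, requires_grad = True)  # source image, NCHW
target = torch.rand(1, 3, 512, 512)                        # target image, NCHW
loss = loss_function(image, target, gaze = [0.5, 0.5], image_colorspace = 'RGB')
loss.backward()
```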
Source code in odak/learn/perception/metameric_loss.py
__call__(image, target, gaze=[0.5, 0.5], image_colorspace='RGB', visualise_loss=False)
Calculates the Metameric Loss.
Parameters:
- image: Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- target: Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- image_colorspace: The current colorspace of your image and target. Ignored if the input does not have 3 channels. Accepted values: RGB, YCrCb.
- gaze: Gaze location in the image, in normalized image coordinates (range [0, 1]) relative to the top left of the image.
- visualise_loss: Shows a heatmap indicating which parts of the image contributed most to the loss.
Returns:
- loss (tensor): The computed loss.
Source code in odak/learn/perception/metameric_loss.py
__init__(device=torch.device('cpu'), alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, n_pyramid_levels=5, mode='quadratic', n_orientations=2, use_l2_foveal_loss=True, fovea_weight=20.0, use_radial_weight=False, use_fullres_l0=False, equi=False)
Parameters:
- alpha: Parameter controlling foveation - larger values mean bigger pooling regions.
- real_image_width: The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance.
- real_viewing_distance: The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width.
- n_pyramid_levels: Number of levels of the steerable pyramid. Note that the image is padded so that both height and width are multiples of 2^(n_pyramid_levels), so setting this value too high will slow down the calculation a lot.
- mode: Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
- n_orientations: Number of orientations in the steerable pyramid. Can be 1, 2, 4 or 6. Increasing this will increase runtime.
- use_l2_foveal_loss: If True, pixels whose pooling size is 1 pixel at the largest scale will use a direct L2 loss against the target rather than pooling over pyramid levels. In practice this gives better results when the loss is used for holography.
- fovea_weight: A weight to apply to the foveal region if use_l2_foveal_loss is set to True.
- use_radial_weight: If True, will apply a radial weighting when calculating the difference between the source and target stats maps. This weights stats closer to the fovea more than those further away.
- use_fullres_l0: If True, stats for the lowpass residual are replaced with blurred versions of the full-resolution source and target images.
- equi: If True, run the loss in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The gaze argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2].
Source code in odak/learn/perception/metameric_loss.py
MetamericLossUniform
Measures metameric loss between a given image and a metamer of the given target image. This variant of the metameric loss is not foveated - it applies uniform pooling sizes to the whole input image.
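A minimal usage sketch, assuming MetamericLossUniform is importable directly from odak.learn.perception; tensor sizes are illustrative:
```python
import torch
from odak.learn.perception import MetamericLossUniform  # assumed import path

loss_function = MetamericLossUniform(
                                     device = torch.device('cpu'),
                                     pooling_size = 32,
                                     n_pyramid_levels = 5,
                                     n_orientations = 2
                                    )
image  = torch.rand(1, 3, 512, 512, requires_grad = True)  # source image, NCHW
target = torch.rand(1, 3, 512, 512)                        # target image, NCHW
loss = loss_function(image, target)                        # no gaze needed - pooling is uniform
loss.backward()
```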
Source code in odak/learn/perception/metameric_loss_uniform.py
__call__(image, target, image_colorspace='RGB', visualise_loss=False)
Calculates the Metameric Loss.
Parameters:
- image: Image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- target: Ground truth target image to compute loss for. Should be an RGB image in NCHW format (4 dimensions).
- image_colorspace: The current colorspace of your image and target. Ignored if the input does not have 3 channels. Accepted values: RGB, YCrCb.
- visualise_loss: Shows a heatmap indicating which parts of the image contributed most to the loss.
Returns:
- loss (tensor): The computed loss.
Source code in odak/learn/perception/metameric_loss_uniform.py
__init__(device=torch.device('cpu'), pooling_size=32, n_pyramid_levels=5, n_orientations=2)
Parameters:
- pooling_size: Pooling size, in pixels. For example 32 will pool over 32x32 blocks of the image.
- n_pyramid_levels: Number of levels of the steerable pyramid. Note that the image is padded so that both height and width are multiples of 2^(n_pyramid_levels), so setting this value too high will slow down the calculation a lot.
- n_orientations: Number of orientations in the steerable pyramid. Can be 1, 2, 4 or 6. Increasing this will increase runtime.
Source code in odak/learn/perception/metameric_loss_uniform.py
gen_metamer(image)
Generates a metamer for an image, following the method in this paper. This function can be used on its own to generate a metamer for a desired image.
Parameters:
- image: Image to compute metamer for. Should be an RGB image in NCHW format (4 dimensions).
Returns:
- metamer (tensor): The generated metamer image.
Source code in odak/learn/perception/metameric_loss_uniform.py
PSNR
Bases: Module
A class to calculate peak-signal-to-noise ratio of an image with respect to a ground truth image.
Source code in odak/learn/perception/image_quality_losses.py
forward(predictions, targets, peak_value=1.0)
A function to calculate peak-signal-to-noise ratio of an image with respect to a ground truth image.
Parameters:
- predictions: Image to be tested.
- targets: Ground truth image.
- peak_value: Peak value that given tensors could have.
Returns:
- result (tensor): Peak-signal-to-noise ratio.
Source code in odak/learn/perception/image_quality_losses.py
RadiallyVaryingBlur
The RadiallyVaryingBlur class provides a way to apply a radially varying blur to an image. Given a gaze location and information about the image and foveation, it applies a blur that will achieve the proper pooling size. The pooling size is chosen to appear the same at a range of display sizes and viewing distances, for a given alpha parameter value. For more information on how the pooling sizes are computed, please see [link coming soon].
The blur is accelerated by generating and sampling from MIP maps of the input image.
This class caches the foveation information. This means that if it is run repeatedly with the same foveation parameters, gaze location and image size (e.g. in an optimisation loop), it won't recalculate the pooling maps.
If you are repeatedly applying blur to images of different sizes (e.g. a pyramid), use one instance of this class per image size for best performance.
Source code in odak/learn/perception/radially_varying_blur.py
blur(image, alpha=0.2, real_image_width=0.2, real_viewing_distance=0.7, centre=None, mode='quadratic', equi=False)
Apply the radially varying blur to an image.
Parameters:
- image: The image to blur, in NCHW format.
- alpha: Parameter controlling foveation - larger values mean bigger pooling regions.
- real_image_width: The real width of the image as displayed to the user. Units don't matter as long as they are the same as for real_viewing_distance. Ignored in equirectangular mode (equi==True).
- real_viewing_distance: The real distance of the observer's eyes to the image plane. Units don't matter as long as they are the same as for real_image_width. Ignored in equirectangular mode (equi==True).
- centre: The centre of the radially varying blur (the gaze location). Should be a tuple of floats containing normalised image coordinates in the range [0,1]. In equirectangular mode this should be yaw & pitch angles in [-pi,pi]x[-pi/2,pi/2].
- mode: Foveation mode, either "quadratic" or "linear". Controls how pooling regions grow as you move away from the fovea. We got best results with "quadratic".
- equi: If True, run the blur function in equirectangular mode. The input is assumed to be an equirectangular format 360 image. The settings real_image_width and real_viewing_distance are ignored. The centre argument is instead interpreted as gaze angles, and should be in the range [-pi,pi]x[-pi/2,pi/2].
Returns:
- output (tensor): The blurred image.
Source code in odak/learn/perception/radially_varying_blur.py
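A minimal sketch of applying the blur, assuming RadiallyVaryingBlur is importable directly from odak.learn.perception and that its constructor takes no arguments; tensor sizes are illustrative:
```python
import torch
from odak.learn.perception import RadiallyVaryingBlur  # assumed import path

blur = RadiallyVaryingBlur()        # constructor assumed to take no arguments
image = torch.rand(1, 3, 512, 512)  # NCHW input
blurred = blur.blur(
                    image,
                    alpha = 0.2,
                    real_image_width = 0.2,
                    real_viewing_distance = 0.7,
                    centre = (0.5, 0.5),  # gaze at the image centre
                    mode = 'quadratic'
                   )
```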
SSIM
Bases: Module
A class to calculate structural similarity index of an image with respect to a ground truth image.
Source code in odak/learn/perception/image_quality_losses.py
forward(predictions, targets)
Parameters:
- predictions (tensor): The predicted images.
- targets: The ground truth images.
Returns:
- result (tensor): The computed SSIM value if successful, otherwise 0.0.
Source code in odak/learn/perception/image_quality_losses.py
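PSNR, SSIM and MSSSIM share the same call pattern. A short sketch for the PSNR and SSIM classes documented above, assuming they are importable directly from odak.learn.perception and that their constructors take no arguments:
```python
import torch
from odak.learn.perception import PSNR, SSIM  # assumed import paths

predictions = torch.rand(1, 3, 256, 256)
targets     = torch.rand(1, 3, 256, 256)

psnr = PSNR()  # constructors assumed to take no arguments
ssim = SSIM()

psnr_value = psnr(predictions, targets, peak_value = 1.0)
ssim_value = ssim(predictions, targets)
```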
SpatialSteerablePyramid
This implements a real-valued steerable pyramid where the filtering is carried out spatially (using convolution) as opposed to multiplication in the Fourier domain. This has a number of optimisations over previous implementations that increase efficiency, but introduce some reconstruction error.
Source code in odak/learn/perception/spatial_steerable_pyramid.py
__init__(use_bilinear_downup=True, n_channels=1, filter_size=9, n_orientations=6, filter_type='full', device=torch.device('cpu'))
Parameters:
- use_bilinear_downup: This uses bilinear filtering when upsampling/downsampling, rather than the original approach of applying a large lowpass kernel and sampling even rows/columns.
- n_channels: Number of channels in the input images (e.g. 3 for RGB input).
- filter_size: Desired size of filters (e.g. 3 will use 3x3 filters).
- n_orientations: Number of oriented bands in each level of the pyramid.
- filter_type: This can be used to select smaller filters than the original ones if desired. "full": original filter sizes. "cropped": some filters are cut back in size by extracting the centre and scaling as appropriate. "trained": same as "cropped", but the oriented kernels are replaced by learned 5x5 kernels.
- device: torch device the input images will be supplied from.
Source code in odak/learn/perception/spatial_steerable_pyramid.py
construct_pyramid(image, n_levels, multiple_highpass=False)
Constructs and returns a steerable pyramid for the provided image.
Parameters:
- image: The input image, in NCHW format. The number of channels C should match n_channels when the pyramid maker was created.
- n_levels: Number of levels in the constructed steerable pyramid.
- multiple_highpass: If True, computes a highpass for each level of the pyramid. These extra levels are redundant (not used for reconstruction).
Returns:
- pyramid (list of dicts of torch.tensor): The computed steerable pyramid. Each level is an entry in a list. The pyramid is ordered from largest levels to smallest levels. Each level is stored as a dict with the following keys: "h" (highpass residual), "l" (lowpass residual) and "b" (oriented bands, a list of torch.tensor).
Source code in odak/learn/perception/spatial_steerable_pyramid.py
reconstruct_from_pyramid(pyramid)
Reconstructs an input image from a steerable pyramid.
Parameters:
- pyramid (list of dicts of torch.tensor): The steerable pyramid. Should be in the same format as output by construct_pyramid(). The number of channels should match n_channels when the pyramid maker was created.
Returns:
- image (tensor): The reconstructed image, in NCHW format.
Source code in odak/learn/perception/spatial_steerable_pyramid.py
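A minimal round-trip sketch, assuming SpatialSteerablePyramid is importable directly from odak.learn.perception; tensor sizes are illustrative:
```python
import torch
from odak.learn.perception import SpatialSteerablePyramid  # assumed import path

pyramid_maker = SpatialSteerablePyramid(n_channels = 3, device = torch.device('cpu'))
image = torch.rand(1, 3, 256, 256)  # NCHW; channel count must match n_channels
pyramid = pyramid_maker.construct_pyramid(image, n_levels = 5)
reconstruction = pyramid_maker.reconstruct_from_pyramid(pyramid)
```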
display_color_hvs
Source code in odak/learn/perception/color_conversion.py
__call__(input_image, ground_truth, gaze=None)
Evaluates an input image against a target ground truth image for a given gaze of a viewer.
Source code in odak/learn/perception/color_conversion.py
__init__(resolution=[1920, 1080], distance_from_screen=800, pixel_pitch=0.311, read_spectrum='tensor', primaries_spectrum=torch.rand(3, 301), device=torch.device('cpu'))
Parameters:
- resolution: Resolution of the display in pixels.
- distance_from_screen: Distance from the screen in mm.
- pixel_pitch: Pixel pitch of the display in mm.
- read_spectrum: Spectrum of the display. Default is 'default', which is the spectrum of the Dell U2415 display.
- device: Device to run the code on. Defaults to CPU.
Source code in odak/learn/perception/color_conversion.py
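A minimal usage sketch, assuming display_color_hvs is importable directly from odak.learn.perception; the input tensor shapes are illustrative only, as the expected image format is not documented here:
```python
import torch
from odak.learn.perception import display_color_hvs  # assumed import path

display_model = display_color_hvs(
                                  resolution = [1920, 1080],
                                  distance_from_screen = 800,
                                  pixel_pitch = 0.311,
                                  device = torch.device('cpu')
                                 )
input_image  = torch.rand(1, 3, 256, 256)  # illustrative NCHW tensors
ground_truth = torch.rand(1, 3, 256, 256)
loss = display_model(input_image, ground_truth)
```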
cone_response_to_spectrum(cone_spectrum, light_spectrum)
Internal function to calculate the cone response at a particular light spectrum.
Parameters:
- cone_spectrum: Spectrum, Wavelength [2,300] tensor.
- light_spectrum: Spectrum, Wavelength [2,300] tensor.
Returns:
- response_to_spectrum (float): Response of cone to light spectrum [1x1].
Source code in odak/learn/perception/color_conversion.py
construct_matrix_lms(l_response, m_response, s_response)
Internal function to construct the 3x3 LMS matrix from the given cone response spectra.
Parameters:
- l_response: Cone response spectrum tensor (normalized response vs wavelength).
- m_response: Cone response spectrum tensor (normalized response vs wavelength).
- s_response: Cone response spectrum tensor (normalized response vs wavelength).
Returns:
- lms_image_tensor (tensor): 3x3 LMSrgb tensor.
Source code in odak/learn/perception/color_conversion.py
construct_matrix_primaries(l_response, m_response, s_response)
Internal function to construct the 3x3 primaries matrix from the given cone response spectra.
Parameters:
- l_response: Cone response spectrum tensor (normalized response vs wavelength).
- m_response: Cone response spectrum tensor (normalized response vs wavelength).
- s_response: Cone response spectrum tensor (normalized response vs wavelength).
Returns:
- lms_image_tensor (tensor): 3x3 LMSrgb tensor.
Source code in odak/learn/perception/color_conversion.py
display_spectrum_response(wavelength, function)
Internal function to provide the light spectrum response at a particular wavelength.
Parameters:
- wavelength: Wavelength in nm [400...700].
- function: Display light spectrum distribution function.
Returns:
- light_response_dict (float): Display light spectrum response value.
Source code in odak/learn/perception/color_conversion.py
initialize_cones_normalized()
Internal function to initialize normalized L, M, S cones as normal distributions with given sigma and mu values.
Returns:
- l_cone_n (tensor): Normalised L cone distribution.
- m_cone_n (tensor): Normalised M cone distribution.
- s_cone_n (tensor): Normalised S cone distribution.