Product Detail Page User Generated Content

You probably noticed that quite a few of the customer photos do not look inspiring or informative for potential buyers. They have odd angles, imperfect framing of the objects, poor colors and lighting, and so on. At Wayfair we care deeply about customer satisfaction and experience, and we want to help our customers see good-looking photos that can help them make an informed decision. We will refer to such images as good quality images.

At Wayfair, we sell products online, where our customers cannot try or see products in person. As a result, our customers rely significantly on product images when weighing their purchase. In previous work we confirmed that conversion rate (CVR) is highly responsive to the quality and relevance of product information. Working on the belief that consumers weigh product images heavily in purchase decisions, we decided to explore opportunities to sort images based on image quality.

However, images from our suppliers may not be enough. First of all, there may be too few supplier images to get enough information about the product. Moreover, supplier images may not be diverse enough, e.g. shot from a specific angle on a white background. In this case customers cannot see how the product may look in a real-life environment. Therefore, we saw great potential in User Generated Content for inspiring and informing our customers. We decided to create a solution that orders images in the UGC photo carousel so that the highest quality images are shown first. This change saves time and provides better information for our customers. Our first attempt was to look at technical aspects of photos such as brightness and contrast. However, our experiments demonstrated that such aspects were not enough to create a reliable solution. So we started looking at existing Computer Vision models for image quality assessment.

After doing research on the existing models we decided to use the NIMA (Neural Image Assessment) model. An open source implementation of NIMA was released by Idealo and contains two models, one for evaluating the technical aspects of images and another for the aesthetic side of images. We also added an existing model (STL-D) from another team: a binary classifier based on Inception-V3 and fine-tuned on images of damaged Wayfair products and social media photos of Wayfair products. It is worth noting that NIMA outputs predictions on a scale from 1 to 10, while STL-D outputs on a range from 0 to 1. However, our evaluation metric was based on ranking images by their quality score, which made the comparison between models straightforward.

The next step was to create a dataset on which we could evaluate the selected models. Creating such a dataset may be challenging because image quality is rather subjective. For example, one person prefers yellow over blue and may like a photo of a yellow couch better. Another person may like bright photos over normal lighting. So it was necessary to have multiple annotators for each image to reduce the bias. In our case we used five annotators for each example.

Another important question was the way of assessing an image. The first thing that may come to mind is to show an image and ask an annotator to rate it on a scale of 1 to 10. However, after reviewing the literature on the topic we discovered issues with such an approach. First of all, it is challenging to assign a score to an image without seeing enough points of reference and getting to know the domain better. Due to the challenging nature of the task it may take a considerable amount of time. Second, annotator scores start shifting toward the median value of the scale rather quickly, and you may end up with a significant part of the dataset rated around 5 or 6 out of 10. In other words, after a while most images start looking somewhat OK to the annotator.

So we decided it was better to have a scenario where an annotator is shown two images side by side (pairwise comparison) and asked to select the one most helpful in informing a purchasing decision. This approach reduces the mental strain on the annotator and greatly simplifies the process. In our pairwise comparison approach each image is shown the same number of times in different pairs for a product. For n images we take C(n, r) pairs, where r is the number of images in a pair (r = 2). Annotating 4 images per product is a reasonable option, since it requires only C(4, 2) = 6 comparisons; it is clear that when labeling images from thousands of products it may get expensive quickly if you use a high number of images per product. Once each image has appeared in 3 comparisons you can calculate its score: for each comparison where the image was selected, it gets one point. So the lowest number of points is 0 while the highest is 3.
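The pairing and scoring scheme above can be sketched in a few lines. This is a minimal illustration, not Wayfair's actual annotation tooling; the function names and the (image_a, image_b, winner) tuple format are our own:

```python
from itertools import combinations

def make_pairs(image_ids):
    """All unordered pairs, i.e. C(n, 2) comparisons for one product."""
    return list(combinations(image_ids, 2))

def score_images(comparisons):
    """One point per comparison an image wins.

    comparisons: iterable of (image_a, image_b, winner) tuples,
    where winner is the image the annotator selected.
    """
    scores = {}
    for a, b, winner in comparisons:
        scores.setdefault(a, 0)
        scores.setdefault(b, 0)
        scores[winner] += 1
    return scores

pairs = make_pairs(["i1", "i2", "i3", "i4"])
print(len(pairs))  # 6 comparisons for 4 images

# Each image appears in exactly 3 pairs, so scores range from 0 to 3.
results = [(a, b, a) for a, b in pairs]  # say the first image of each pair wins
print(score_images(results))  # {'i1': 3, 'i2': 2, 'i3': 1, 'i4': 0}
```

With 4 images per product, every image appears in exactly 3 of the 6 pairs, which is what makes the 0-to-3 score range from the text fall out directly.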
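The point about NIMA (1–10) and STL-D (0–1) producing scores on different scales is worth making concrete: a ranking-based evaluation metric only looks at the order the scores induce, so the absolute scale drops out. A small sketch, where the scores are made-up illustrative values, not real model outputs:

```python
def rank_order(scores):
    """Image ids sorted best-first; only the ordering matters,
    so models with different output scales can be compared."""
    return sorted(scores, key=scores.get, reverse=True)

nima_scores = {"a": 7.9, "b": 5.2, "c": 6.4}      # hypothetical 1-10 outputs
stl_d_scores = {"a": 0.81, "b": 0.35, "c": 0.62}  # hypothetical 0-1 outputs

print(rank_order(nima_scores))   # ['a', 'c', 'b']
print(rank_order(stl_d_scores))  # ['a', 'c', 'b'] -- same ordering, different scales
```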
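The "technical aspects" baseline mentioned earlier can be approximated with simple image statistics. A minimal sketch under the assumption of an 8-bit grayscale array, using mean intensity as brightness and the standard deviation of intensities as RMS contrast (the exact features Wayfair used are not specified in the text):

```python
import numpy as np

def brightness(gray):
    """Mean pixel intensity of an 8-bit grayscale image (0-255)."""
    return float(gray.mean())

def rms_contrast(gray):
    """Standard deviation of pixel intensities (RMS contrast)."""
    return float(gray.std())

# A 2x2 checkerboard: maximally contrasty, middling brightness.
checkerboard = np.array([[0, 255], [255, 0]], dtype=np.uint8)
print(brightness(checkerboard))    # 127.5
print(rms_contrast(checkerboard))  # 127.5
```

Statistics like these are cheap to compute, which makes it easy to see why they were the first attempt, and also why they fall short: they say nothing about framing, angle, or how informative the photo actually is.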