Preprints (Google Scholar Profile)

  1. High Dynamic Range Image Quality Assessment Based on Frequency Disparity
    Yue Liu, Zhangkai Ni, Shiqi Wang, Hanli Wang, and Sam Kwong.
    Abstract | Paper | Code | BibTex Abstract: In this paper, a novel and effective image quality assessment (IQA) algorithm based on frequency disparity for high dynamic range (HDR) images is proposed, termed as local-global frequency feature-based model (LGFM). Motivated by the assumption that the human visual system is highly adapted for extracting structural information and partial frequencies when perceiving the visual scene, the Gabor and the Butterworth filters are applied to the luminance of the HDR image to extract local and global frequency features, respectively. The similarity measurement and feature pooling are sequentially performed on the frequency features to obtain the predicted quality score. The experiments evaluated on four widely used benchmarks demonstrate that the proposed LGFM can provide a higher consistency with the subjective perception compared with the state-of-the-art HDR IQA methods.
    @article{liu2022high,
    	title={High Dynamic Range Image Quality Assessment Based on Frequency Disparity},
    	author={Liu, Yue and Ni, Zhangkai and Wang, Shiqi and Wang, Hanli and Kwong, Sam},
    	journal={arXiv preprint arXiv:2209.02285},
    	year={2022}
    }
  2. Just Noticeable Difference Modeling for Face Recognition System
    Yu Tian, Zhangkai Ni, Baoliang Chen, Shurun Wang, Shiqi Wang, Hanli Wang, and Sam Kwong.
    Abstract | Paper | Code | BibTex Abstract: High-quality face images are required to guarantee the stability and reliability of automatic face recognition (FR) systems in surveillance and security scenarios. However, a massive amount of face data is usually compressed before being analyzed due to limitations on transmission or storage. The compressed images may lose the powerful identity information, resulting in the performance degradation of the FR system. Herein, we make the first attempt to study just noticeable difference (JND) for the FR system, which can be defined as the maximum distortion that the FR system cannot notice. More specifically, we establish a JND dataset including 3530 original images and 137,670 compressed images generated by advanced reference encoding/decoding software based on the Versatile Video Coding (VVC) standard (VTM-15.0). Subsequently, we develop a novel JND prediction model to directly infer JND images for the FR system. In particular, in order to maximum redundancy removal without impairment of robust identity information, we apply the encoder with multiple feature extraction and attention-based feature decomposition modules to progressively decompose face features into two uncorrelated components, i.e., identity and residual features, via self-supervised learning. Then, the residual feature is fed into the decoder to generate the residual map. Finally, the predicted JND map is obtained by subtracting the residual map from the original image. Experimental results have demonstrated that the proposed model achieves higher accuracy of JND map prediction compared with the state-of-the-art JND models, and is capable of saving more bits while maintaining the performance of the FR system compared with VTM-15.0.
    @article{tian2022just,
    	title={Just Noticeable Difference Modeling for Face Recognition System},
    	author={Tian, Yu and Ni, Zhangkai and Chen, Baoliang and Wang, Shurun and Wang, Shiqi and Wang, Hanli and Kwong, Sam},
    	journal={arXiv preprint arXiv:2209.05856},
    	year={2022}
    }
  3. Generalized Visual Quality Assessment of GAN-Generated Face Images
    Yu Tian, Zhangkai Ni, Baoliang Chen, Shiqi Wang, Hanli Wang, and Sam Kwong.
    Abstract | Paper | Code | BibTex Abstract: Recent years have witnessed the dramatically increased interest in face generation with generative adversarial networks (GANs). A number of successful GAN algorithms have been developed to produce vivid face images towards different application scenarios. However, little work has been dedicated to automatic quality assessment of such GAN-generated face images (GFIs), even less have been devoted to generalized and robust quality assessment of GFIs generated with unseen GAN model. Herein, we make the first attempt to study the subjective and objective quality towards generalized quality assessment of GFIs. More specifically, we establish a large-scale database consisting of GFIs from four GAN algorithms, the pseudo labels from image quality assessment (IQA) measures, as well as the human opinion scores via subjective testing. Subsequently, we develop a quality assessment model that is able to deliver accurate quality predictions for GFIs from both available and unseen GAN algorithms based on meta-learning. In particular, to learn shared knowledge from GFIs pairs that are born of limited GAN algorithms, we develop the convolutional block attention (CBA) and facial attributes-based analysis (ABA) modules, ensuring that the learned knowledge tends to be consistent with human visual perception. Extensive experiments exhibit that the proposed model achieves better performance compared with the state-of-the-art IQA models, and is capable of retaining the effectiveness when evaluating GFIs from the unseen GAN algorithms.
    @article{tian2022generalized,
    	title={Generalized Visual Quality Assessment of GAN-Generated Face Images},
    	author={Tian, Yu and Ni, Zhangkai and Chen, Baoliang and Wang, Shiqi and Wang, Hanli and Kwong, Sam},
    	journal={arXiv preprint arXiv:2201.11975},
    	year={2022}
    }
  4. CSformer: Bridging Convolution and Transformer for Compressive Sensing
    Dongjie Ye, Zhangkai Ni, Hanli Wang, Jian Zhang, Shiqi Wang, and Sam Kwong.
    Abstract | Paper | Code | BibTex Abstract: Convolution neural networks (CNNs) have succeeded in compressive image sensing. However, due to the inductive bias of locality and weight sharing, the convolution operations demonstrate the intrinsic limitations in modeling the long-range dependency. Transformer, designed initially as a sequence-to-sequence model, excels at capturing global contexts due to the self-attention-based architectures even though it may be equipped with limited localization abilities. This paper proposes CSformer, a hybrid framework that integrates the advantages of leveraging both detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning. The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery. In the sampling module, images are measured block-by-block by the learned sampling matrix. In the reconstruction stage, the measurement is projected into dual stems. One is the CNN stem for modeling the neighborhood relationships by convolution, and the other is the transformer stem for adopting global self-attention mechanism. The dual branches structure is concurrent, and the local features and global representations are fused under different resolutions to maximize the complementary of features. Furthermore, we explore a progressive strategy and window-based transformer block to reduce the parameter and computational complexity. The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing, which achieves superior performance compared to state-of-the-art methods on different datasets.
    @article{ye2021csformer,
    	title={CSformer: Bridging Convolution and Transformer for Compressive Sensing},
    	author={Ye, Dongjie and Ni, Zhangkai and Wang, Hanli and Zhang, Jian and Wang, Shiqi and Kwong, Sam},
    	journal={arXiv preprint arXiv:2112.15299},
    	year={2021}
    }
Journal Publications

  1. Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network
    Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong.
    IEEE Transactions on Image Processing (T-IP), vol. 29, pp. 9140-9151, September 2020.
    Abstract | Paper | Code | BibTex Abstract: Improving the aesthetic quality of images is challenging and eager for the public. To address this problem, most existing algorithms are based on supervised learning methods to learn an automatic photo enhancer for paired data, which consists of low-quality photos and corresponding expert-retouched versions. However, the style and characteristics of photos retouched by experts may not meet the needs or preferences of general users. In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning on a large number of paired images. The proposed model is based on single deep GAN which embeds the modulation and attention mechanisms to capture richer global and local features. Based on the proposed model, we introduce two losses to deal with the unsupervised image enhancement: (1) fidelity loss, which is defined as a ℓ2 regularization in the feature domain of a pre-trained VGG network to ensure the content between the enhanced image and the input image is the same, and (2) quality loss that is formulated as a relativistic hinge adversarial loss to endow the input image the desired characteristics. Both quantitative and qualitative results show that the proposed model effectively improves the aesthetic quality of images. Our code is available at: https://github.com/eezkni/UEGAN.
    @article{ni2020towards,
    	title={Towards unsupervised deep image enhancement with generative adversarial network},
    	author={Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Ma, Lin and Kwong, Sam},
    	journal={IEEE Transactions on Image Processing},
    	volume={29},
    	pages={9140--9151},
    	year={2020},
    	publisher={IEEE}
    }
  2. Color Image Demosaicing Using Progressive Collaborative Representation
    Zhangkai Ni, Kai-Kuang Ma, Huanqiang Zeng, and Baojiang Zhong.
    IEEE Transactions on Image Processing (T-IP), vol. 29, pp. 4952-4964, March 2020.
    Abstract | Paper | Code | BibTex Abstract: In this paper, a progressive collaborative representation (PCR) framework is proposed that is able to incorporate any existing color image demosaicing method for further boosting its demosaicing performance. Our PCR consists of two phases: (i) offline training and (ii) online refinement. In phase (i), multiple training-and-refining stages will be performed. In each stage, a new dictionary will be established through the learning of a large number of feature-patch pairs, extracted from the demosaicked images of the current stage and their corresponding original full-color images. After training, a projection matrix will be generated and exploited to refine the current demosaicked image. The updated image with improved image quality will be used as the input for the next training-and-refining stage and performed the same processing likewise. At the end of phase (i), all the projection matrices generated as above-mentioned will be exploited in phase (ii) to conduct online demosaicked image refinement of the test image. Extensive simulations conducted on two commonly-used test datasets (i.e., the IMAX and Kodak) for evaluating the demosaicing algorithms have clearly demonstrated that our proposed PCR framework is able to constantly boost the performance of any image demosaicing method we experimented, in terms of the objective and subjective performance evaluations.
    @article{ni2020color,
    	title={Color Image Demosaicing Using Progressive Collaborative Representation},
    	author={Ni, Zhangkai and Ma, Kai-Kuang and Zeng, Huanqiang and Zhong, Baojiang},
    	journal={IEEE Transactions on Image Processing},
    	volume={29},
    	number={1},
    	pages={4952--4964},
    	year={2020},
    	publisher={IEEE}
    }
  3. Just Noticeable Distortion Profile Inference: A Patch-level Structural Visibility Learning Approach
    Xuelin Shen, Zhangkai Ni, Wenhan Yang, Shiqi Wang, Xinfeng Zhang, and Sam Kwong.
    IEEE Transactions on Image Processing (T-IP), vol. 30, pp. 26-38, November 2020.
    Abstract | Paper | Code | Dataset | Project | BibTex Abstract: In this paper, we propose an effective approach to infer the just noticeable distortion (JND) profile based on patch-level structural visibility learning. Instead of pixel-level JND profile estimation, the image patch, which is regarded as the basic processing unit to better correlate with the human perception, can be further decomposed into three conceptually independent components for visibility estimation. In particular, to incorporate the structural degradation into the patch-level JND model, a deep learning-based structural degradation estimation model is trained to approximate the masking of structural visibility. In order to facilitate the learning process, a JND dataset is further established, including 202 pristine images and 7878 distorted images generated by advanced compression algorithms based on the upcoming Versatile Video Coding (VVC) standard. Extensive experimental results further show the superiority of the proposed approach over the state-of-the-art. Our dataset is available at: https://shenxuelin-cityu.github.io/jnd.html.
    @article{shen2020just,
    	title={Just Noticeable Distortion Profile Inference: A Patch-level Structural Visibility Learning Approach},
    	author={Shen, Xuelin and Ni, Zhangkai and Yang, Wenhan and Zhang, Xinfeng and Wang, Shiqi and Kwong, Sam},
    	journal={IEEE Transactions on Image Processing},
    	volume={30},
    	pages={26--38},
    	year={2020},
    	publisher={IEEE}
    }
  4. Unimodal Model-Based Inter Mode Decision for High Efficiency Video Coding
    Huanqiang Zeng, Wenjie Xiang, Jing Chen, Canhui Cai, Zhangkai Ni, and Kai-Kuang Ma.
    IEEE Access, vol. 7, pp. 27936-27947, February 2019.
    Abstract | Paper | BibTex Abstract: In this paper, a fast inter mode decision algorithm, called the unimodal model-based inter mode decision (UMIMD), is proposed for the latest video coding standard, the high-efficiency video coding. Through extensive simulations, it has been observed that a unimodal model (i.e., with only one global minimum value) can be established among the size of different prediction unit (PU) modes and their resulted rate-distortion (RD) costs for each quad-tree partitioned coding tree unit (CTU). To guarantee the unimodality and further search the optimal operating point over this function for each CTU, all the PU modes need to be first classified into 11 mode classes according to their sizes. These classes are then properly ordered and sequentially checked according to the class index, from small to large so that the optimal mode can be early identified by checking when the RD cost starts to arise. In addition, an effective instant SKIP mode termination scheme is developed by simply checking the SKIP mode against a pre-determined threshold to further reduce the computational complexity. The extensive simulation results have shown that the proposed UMIMD algorithm is able to individually achieve a significant reduction on computational complexity at the encoder by 61.9% and 64.2% on average while incurring only 1.7% and 2.1% increment on the total Bjontegaard delta bit rate (BDBR) for the low delay and random access test conditions, compared with the exhaustive mode decision in the HEVC. Moreover, the experimental results have further demonstrated that the proposed UMIMD algorithm outperforms multiple state-of-the-art methods.
    @article{zeng2019unimodal,
    	title={Unimodal Model-Based Inter Mode Decision for High Efficiency Video Coding},
    	author={Zeng, Huanqiang and Xiang, Wenjie and Chen, Jing and Cai, Canhui and Ni, Zhangkai and Ma, Kai-Kuang},
    	journal={IEEE Access},
    	volume={7},
    	pages={27936--27947},
    	year={2019},
    	publisher={IEEE}
    }
  5. A Gabor Feature-Based Quality Assessment Model for the Screen Content Images
    Zhangkai Ni, Huanqiang Zeng, Lin Ma, Junhui Hou, Jing Chen, and Kai-Kuang Ma.
    IEEE Transactions on Image Processing (T-IP), vol. 27, no. 9, pp. 4516-4528, September 2018.
    Abstract | Paper | Code | Project | BibTex Abstract: In this paper, an accurate and efficient full-reference image quality assessment (IQA) model using the extracted Gabor features, called Gabor feature-based model (GFM), is proposed for conducting objective evaluation of screen content images (SCIs). It is well-known that the Gabor filters are highly consistent with the response of the human visual system (HVS), and the HVS is highly sensitive to the edge information. Based on these facts, the imaginary part of the Gabor filter that has odd symmetry and yields edge detection is exploited to the luminance of the reference and distorted SCI for extracting their Gabor features, respectively. The local similarities of the extracted Gabor features and two chrominance components, recorded in the LMN color space, are then measured independently. Finally, the Gabor-feature pooling strategy is employed to combine these measurements and generate the final evaluation score. Experimental simulation results obtained from two large SCI databases have shown that the proposed GFM model not only yields a higher consistency with the human perception on the assessment of SCIs but also requires a lower computational complexity, compared with that of classical and state-of-the-art IQA models.
    @article{ni2018gabor,
    	title={A Gabor feature-based quality assessment model for the screen content images},
    	author={Ni, Zhangkai and Zeng, Huanqiang and Ma, Lin and Hou, Junhui and Chen, Jing and Ma, Kai-Kuang},
    	journal={IEEE Transactions on Image Processing},
    	volume={27},
    	number={9},
    	pages={4516--4528},
    	year={2018},
    	publisher={IEEE}
    }
  6. Screen Content Image Quality Assessment Using Multi-Scale Difference of Gaussian
    Ying Fu, Huanqiang Zeng, Lin Ma, Zhangkai Ni, Jianqing Zhu, and Kai-Kuang Ma.
    IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), vol. 28, no. 9, pp. 2428-2432, September 2018.
    Abstract | Paper | Code | BibTex Abstract: In this paper, a novel image quality assessment (IQA) model for the screen content images (SCIs) is proposed by using multi-scale difference of Gaussian (MDOG). Motivated by the observation that the human visual system (HVS) is sensitive to the edges while the image details can be better explored in different scales, the proposed model exploits MDOG to effectively characterize the edge information of the reference and distorted SCIs at two different scales, respectively. Then, the degree of edge similarity is measured in terms of the smaller-scale edge map. Finally, the edge strength computed based on the larger-scale edge map is used as the weighting factor to generate the final SCI quality score. Experimental results have shown that the proposed IQA model for the SCIs produces high consistency with human perception of the SCI quality and outperforms the state-of-the-art quality models.
    @article{fu2018screen,
    	title={Screen content image quality assessment using multi-scale difference of gaussian},
    	author={Fu, Ying and Zeng, Huanqiang and Ma, Lin and Ni, Zhangkai and Zhu, Jianqing and Ma, Kai-Kuang},
    	journal={IEEE Transactions on Circuits and Systems for Video Technology},
    	volume={28},
    	number={9},
    	pages={2428--2432},
    	year={2018},
    	publisher={IEEE}
    }
  7. ESIM: Edge Similarity for Screen Content Image Quality Assessment
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Jing Chen, Canhui Cai, and Kai-Kuang Ma.
    IEEE Transactions on Image Processing (T-IP), vol. 26, no. 10, pp. 4818-4831, October 2017.
    Abstract | Paper | Code | Dataset | Project | BibTex Abstract: In this paper, an accurate full-reference image quality assessment (IQA) model developed for assessing screen content images (SCIs), called the edge similarity (ESIM), is proposed. It is inspired by the fact that the human visual system (HVS) is highly sensitive to edges that are often encountered in SCIs; therefore, essential edge features are extracted and exploited for conducting IQA for the SCIs. The key novelty of the proposed ESIM lies in the extraction and use of three salient edge features, i.e., edge contrast, edge width, and edge direction. The first two attributes are simultaneously generated from the input SCI based on a parametric edge model, while the last one is derived directly from the input SCI. The extraction of these three features will be performed for the reference SCI and the distorted SCI, individually. The degree of similarity measured for each above-mentioned edge attribute is then computed independently, followed by combining them together using our proposed edge-width pooling strategy to generate the final ESIM score. To conduct the performance evaluation of our proposed ESIM model, a new and the largest SCI database (denoted as SCID) is established in our work and made to the public for download. Our database contains 1800 distorted SCIs that are generated from 40 reference SCIs. For each SCI, nine distortion types are investigated, and five degradation levels are produced for each distortion type. Extensive simulation results have clearly shown that the proposed ESIM model is more consistent with the perception of the HVS on the evaluation of distorted SCIs than the multiple state-of-the-art IQA methods.
    @article{ni2017esim,
    	title={ESIM: Edge similarity for screen content image quality assessment},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Chen, Jing and Cai, Canhui and Ma, Kai-Kuang},
    	journal={IEEE Transactions on Image Processing},
    	volume={26},
    	number={10},
    	pages={4818--4831},
    	year={2017},
    	publisher={IEEE}
    }
  8. Gradient Direction for Screen Content Image Quality Assessment
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Canhui Cai, and Kai-Kuang Ma.
    IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1394-1398, August 2016.
    Abstract | Paper | Code | Project | BibTex Abstract: In this letter, we make the first attempt to explore the usage of the gradient direction to conduct the perceptual quality assessment of the screen content images (SCIs). Specifically, the proposed approach first extracts the gradient direction based on the local information of the image gradient magnitude, which not only preserves gradient direction consistency in local regions, but also demonstrates sensitivities to the distortions introduced to the SCI. A deviation-based pooling strategy is subsequently utilized to generate the corresponding image quality index. Moreover, we investigate and demonstrate the complementary behaviors of the gradient direction and magnitude for SCI quality assessment. By jointly considering them together, our proposed SCI quality metric outperforms the state-of-the-art quality metrics in terms of correlation with human visual system perception.
    @article{ni2016gradient,
    	title={Gradient direction for screen content image quality assessment},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Cai, Canhui and Ma, Kai-Kuang},
    	journal={IEEE Signal Processing Letters},
    	volume={23},
    	number={10},
    	pages={1394--1398},
    	year={2016},
    	publisher={IEEE}
    }

Conference Publications

  1. Cycle-Interactive Generative Adversarial Network for Robust Unsupervised Low-Light Enhancement
    Zhangkai Ni, Wenhan Yang, Hanli Wang, Shiqi Wang, Lin Ma, and Sam Kwong.
    In Proceedings of the 30th ACM International Conference on Multimedia (ACM Multimedia), October 2022
    Abstract | Paper | Code | BibTex Abstract: Getting rid of the fundamental limitations in fitting to the paired training data, recent unsupervised low-light enhancement methods excel in adjusting illumination and contrast of images. However, for unsupervised low light enhancement, the remaining noise suppression issue due to the lacking of supervision of detailed signal largely impedes the wide deployment of these methods in real-world applications. Herein, we propose a novel Cycle-Interactive Generative Adversarial Network (CIGAN) for unsupervised low-light image enhancement, which is capable of not only better transferring illumination distributions between low/normal-light images but also manipulating detailed signals between two domains, e.g., suppressing/synthesizing realistic noise in the cyclic enhancement/degradation process. In particular, the proposed low-light guided transformation feed-forwards the features of low-light images from the generator of enhancement GAN (eGAN) into the generator of degradation GAN (dGAN). With the learned information of real low-light images, dGAN can synthesize more realistic diverse illumination and contrast in low-light images. Moreover, the feature randomized perturbation module in dGAN learns to increase the feature randomness to produce diverse feature distributions, persuading the synthesized low-light images to contain realistic noise. Extensive experiments demonstrate both the superiority of the proposed method and the effectiveness of each module in CIGAN.
    @inproceedings{ni2022cycle,
    	title={Cycle-Interactive Generative Adversarial Network for Robust Unsupervised Low-Light Enhancement},
    	author={Ni, Zhangkai and Yang, Wenhan and Wang, Hanli and Wang, Shiqi and Ma, Lin and Kwong, Sam},
    	booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
    	year={2022}
    }
  2. Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network
    Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong.
    In Proceedings of the 28th ACM International Conference on Multimedia (ACM Multimedia), pp. 1697-1705, October 2020
    Abstract | Paper | Code | BibTex Abstract: In this work, we aim to learn an unpaired image enhancement model, which can enrich low-quality images with the characteristics of high-quality images provided by users. We propose a quality attention generative adversarial network (QAGAN) trained on unpaired data based on the bidirectional Generative Adversarial Network (GAN) embedded with a quality attention module (QAM). The key novelty of the proposed QAGAN lies in the injected QAM for the generator such that it learns domain-relevant quality attention directly from the two domains. More specifically, the proposed QAM allows the generator to effectively select semantic-related characteristics from the spatial-wise and adaptively incorporate style-related attributes from the channel-wise, respectively. Therefore, in our proposed QAGAN, not only discriminators but also the generator can directly access both domains which significantly facilitate the generator to learn the mapping function. Extensive experimental results show that, compared with the state-of-the-art methods based on unpaired learning, our proposed method achieves better performance in both objective and subjective evaluations.
    @inproceedings{ni2020unpaired,
    	title={Unpaired image enhancement with quality-attention generative adversarial network},
    	author={Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Ma, Lin and Kwong, Sam},
    	booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
    	pages={1697--1705},
    	year={2020}
    }
  3. A JND Dataset Based on VVC Compressed Images
    Xuelin Shen, Zhangkai Ni, Wenhan Yang, Xinfeng Zhang, Shiqi Wang, and Sam Kwong.
    2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), June 2020.
    Abstract | Paper | Dataset | BibTex Abstract: In this paper, we establish a just noticeable distortion (JND) dataset based on the next generation video coding standard Versatile Video Coding (VVC). The dataset consists of 202 images which cover a wide range of content with resolution 1920×1080. Each image is encoded by VTM 5.0 intra coding with the quantization parameter (QP) ranging from 13 to 51. The details regarding dataset construction, subjective testing and data post-processing are described in this paper. Finally, the significance of the dataset towards future video coding research is envisioned. All source images as well as the testing data have been made available to the public.
    @inproceedings{shen2020jnd,
    	title={A JND Dataset Based on VVC Compressed Images},
    	author={Shen, Xuelin and Ni, Zhangkai and Yang, Wenhan and Zhang, Xinfeng and Wang, Shiqi and Kwong, Sam},
    	booktitle={2020 IEEE International Conference on Multimedia \& Expo Workshops (ICMEW)},
    	pages={1--6},
    	year={2020},
    	organization={IEEE}
    }
  4. SCID: A Database for Screen Content Images Quality Assessment
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Ying Fu, Lu Xing, and Kai-Kuang Ma.
    International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 774-779, November 2017.
    Abstract | Paper | Dataset | Project | BibTex Abstract: Perceptual quality assessment of screen content images (SCIs) has become a new challenging topic in the recent research of image quality assessment (IQA). In this work, we construct a new SCI database (called as SCID) for subjective quality evaluate of SCIs and investigate whether existing IQA models can effectively assess the perceptual quality of distorted SCIs. The proposed SCID, which is currently the largest one, containing 1,800 distorted SCIs generated from 40 reference SCIs with 9 types of distortions and 5 degradation levels for each distortion type. The double-stimulus impairment scale (DSIS) method is then employed to rate the perceptual quality, in which each image is evaluated by at least 40 assessors. After processing, each distorted SCI is accompanied with one mean opinion score (MOS) value to indicate its perceptual quality as ground truth. Based on the constructed SCID, we evaluate the performances of 14 state-of-the-art IQA metrics. Experimental results show that the existing IQA metrics do not be able to evaluate the perceptual quality of SCIs well and an IQA metric specifically for SCIs is thus desirable. The proposed SCID will be made publicly available to the research community for further investigation on the perceptual processing of SCIs.
    @inproceedings{ni2017scid,
    	title={SCID: A database for screen content images quality assessment},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Fu, Ying and Xing, Lu and Ma, Kai-Kuang},
    	booktitle={2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)},
    	pages={774--779},
    	year={2017},
    	organization={IEEE}
    }
  5. Screen Content Image Quality Assessment Using Euclidean Distance
    Ying Fu, Huanqiang Zeng, Zhangkai Ni, Jing Chen, Canhui Cai, and Kai-Kuang Ma.
    International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 44-49, November 2017.
    Abstract | Paper | BibTex Abstract: Considering that human visual system (HVS) is greatly sensitive to edge, in this study, we design a new full-reference objective quality assessment method for screen content images (SCIs). The key novelty lies in the extracting of the edge information by computing the Euclidean distance of luminance in the SCIs. Since HVS is greatly suitable for extracting structural information, the structure information is incorporated into our proposed model. The extracted information is then used to compute the similarity maps of the reference SCI and its distorted SCI. Finally, we combine the obtained maps by using our designed pooling strategy. Experience results have shown that the designed method get higher correlation with the subjective quality score than state-of-the-art quality assessment models.
    @inproceedings{fu2017screen,
    	title={Screen content image quality assessment using Euclidean distance},
    	author={Fu, Ying and Zeng, Huanqiang and Ni, Zhangkai and Chen, Jing and Cai, Canhui and Ma, Kai-Kuang},
    	booktitle={2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)},
    	pages={44--49},
    	year={2017},
    	organization={IEEE}
    }
  6. Screen Content Image Quality Assessment Using Edge Model
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Canhui Cai, and Kai-Kuang Ma.
    IEEE International Conference on Image Processing (ICIP), pp. 81-85, August 2016.
    Abstract | Paper | Code | BibTex Abstract: Since the human visual system (HVS) is highly sensitive to edges, a novel image quality assessment (IQA) metric for assessing screen content images (SCIs) is proposed in this paper. The turnkey novelty lies in the use of an existing parametric edge model to extract two types of salient attributes - namely, edge contrast and edge width, for the distorted SCI under assessment and its original SCI, respectively. The extracted information is subject to conduct similarity measurements on each attribute, independently. The obtained similarity scores are then combined using our proposed edge-width pooling strategy to generate the final IQA score. Hopefully, this score is consistent with the judgment made by the HVS. Experimental results have shown that the proposed IQA metric produces higher consistency with that of the HVS on the evaluation of the image quality of the distorted SCI than that of other state-of-the-art IQA metrics.
    @inproceedings{ni2016screen,
    	title={Screen content image quality assessment using edge model},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Cai, Canhui and Ma, Kai-Kuang},
    	booktitle={2016 IEEE International Conference on Image Processing (ICIP)},
    	pages={81--85},
    	year={2016},
    	organization={IEEE}
    }

Patents

  1. A Rain Removal Image Post-processing Method Based on Progressive Collaborative Representation
    Huanqiang Zeng, Xiangwei Lin, Zhangkai Ni, Jiuwen Cao, Jianqing Zhu, and Kai-Kuang Ma
    Application No. 10201906356T, July 2019. (Chinese Patent)
  2. Colour Image Demosaicing Using Progressive Collaborative Representation
    Kai-Kuang Ma, and Zhangkai Ni
    Application No. 10201906356T, July 2019. (Singapore Patent)
  3. A Multi-Exposure Fused Image Quality Assessment Method Based on Contrast and Saturation
    Huanqiang Zeng, Lu Xing, Zhangkai Ni, Jiuwen Cao, Canhui Cai, and Kai-Kuang Ma
    Application No. 2016111584053, December 2016. (Chinese Patent)
  4. A Screen Content Image Quality Assessment Method Based on Phase Congruency
    Huanqiang Zeng, Zhangkai Ni, Lin Ma, Jiuwen Cao, Canhui Cai, and Kai-Kuang Ma
    Application No. 2016108863395, October 2016. (Chinese Patent)