Journal Publications (Google Scholar Profile)

  1. Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network
    Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong.
    IEEE Transactions on Image Processing (T-IP), vol. 29, pp. 9140-9151, September 2020.
    Abstract | Paper | Code | BibTex Abstract: Improving the aesthetic quality of images is challenging and eager for the public. To address this problem, most existing algorithms are based on supervised learning methods to learn an automatic photo enhancer for paired data, which consists of low-quality photos and corresponding expert-retouched versions. However, the style and characteristics of photos retouched by experts may not meet the needs or preferences of general users. In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning on a large number of paired images. The proposed model is based on single deep GAN which embeds the modulation and attention mechanisms to capture richer global and local features. Based on the proposed model, we introduce two losses to deal with the unsupervised image enhancement: (1) fidelity loss, which is defined as a ℓ2 regularization in the feature domain of a pre-trained VGG network to ensure the content between the enhanced image and the input image is the same, and (2) quality loss that is formulated as a relativistic hinge adversarial loss to endow the input image the desired characteristics. Both quantitative and qualitative results show that the proposed model effectively improves the aesthetic quality of images. Our code is available at: https://github.com/eezkni/UEGAN.
    @article{ni2020towards,
    	title={Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network},
    	author={Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Ma, Lin and Kwong, Sam},
    	journal={IEEE Transactions on Image Processing},
    	volume={29},
    	pages={9140--9151},
    	year={2020},
    	publisher={IEEE}
    }
  2. Color Image Demosaicing Using Progressive Collaborative Representation
    Zhangkai Ni, Kai-Kuang Ma, Huanqiang Zeng, and Baojiang Zhong.
    IEEE Transactions on Image Processing (T-IP), vol. 29, pp. 4952-4964, March 2020.
    Abstract | Paper | Code | BibTex Abstract: In this paper, a progressive collaborative representation (PCR) framework is proposed that is able to incorporate any existing color image demosaicing method for further boosting its demosaicing performance. Our PCR consists of two phases: (i) offline training and (ii) online refinement. In phase (i), multiple training-and-refining stages will be performed. In each stage, a new dictionary will be established through the learning of a large number of feature-patch pairs, extracted from the demosaicked images of the current stage and their corresponding original full-color images. After training, a projection matrix will be generated and exploited to refine the current demosaicked image. The updated image with improved image quality will be used as the input for the next training-and-refining stage and performed the same processing likewise. At the end of phase (i), all the projection matrices generated as above-mentioned will be exploited in phase (ii) to conduct online demosaicked image refinement of the test image. Extensive simulations conducted on two commonly-used test datasets (i.e., the IMAX and Kodak) for evaluating the demosaicing algorithms have clearly demonstrated that our proposed PCR framework is able to constantly boost the performance of any image demosaicing method we experimented, in terms of the objective and subjective performance evaluations.
    @article{ni2020color,
    	title={Color Image Demosaicing Using Progressive Collaborative Representation},
    	author={Ni, Zhangkai and Ma, Kai-Kuang and Zeng, Huanqiang and Zhong, Baojiang},
    	journal={IEEE Transactions on Image Processing},
    	volume={29},
    	number={1},
    	pages={4952--4964},
    	year={2020},
    	publisher={IEEE}
    }
  3. Just Noticeable Distortion Profile Inference: A Patch-level Structural Visibility Learning Approach
    Xuelin Shen, Zhangkai Ni, Wenhan Yang, Shiqi Wang, Xinfeng Zhang, and Sam Kwong.
    IEEE Transactions on Image Processing (T-IP), vol. 30, pp. 26-38, November 2020.
    Abstract | Paper | Code | Dataset | Project | BibTex Abstract: In this paper, we propose an effective approach to infer the just noticeable distortion (JND) profile based on patch-level structural visibility learning. Instead of pixel-level JND profile estimation, the image patch, which is regarded as the basic processing unit to better correlate with the human perception, can be further decomposed into three conceptually independent components for visibility estimation. In particular, to incorporate the structural degradation into the patch-level JND model, a deep learning-based structural degradation estimation model is trained to approximate the masking of structural visibility. In order to facilitate the learning process, a JND dataset is further established, including 202 pristine images and 7878 distorted images generated by advanced compression algorithms based on the upcoming Versatile Video Coding (VVC) standard. Extensive experimental results further show the superiority of the proposed approach over the state-of-the-art. Our dataset is available at: https://shenxuelin-cityu.github.io/jnd.html.
    @article{shen2020just,
    	title={Just Noticeable Distortion Profile Inference: A Patch-level Structural Visibility Learning Approach},
    	author={Shen, Xuelin and Ni, Zhangkai and Yang, Wenhan and Zhang, Xinfeng and Wang, Shiqi and Kwong, Sam},
    	journal={IEEE Transactions on Image Processing},
    	volume={30},
    	pages={26--38},
    	year={2020},
    	publisher={IEEE}
    }
  4. Unimodal Model-Based Inter Mode Decision for High Efficiency Video Coding
    Huanqiang Zeng, Wenjie Xiang, Jing Chen, Canhui Cai, Zhangkai Ni, and Kai-Kuang Ma.
    IEEE Access, vol. 7, pp. 27936-27947, February 2019.
    Abstract | Paper | BibTex Abstract: In this paper, a fast inter mode decision algorithm, called the unimodal model-based inter mode decision (UMIMD), is proposed for the latest video coding standard, the high-efficiency video coding. Through extensive simulations, it has been observed that a unimodal model (i.e., with only one global minimum value) can be established among the size of different prediction unit (PU) modes and their resulted rate-distortion (RD) costs for each quad-tree partitioned coding tree unit (CTU). To guarantee the unimodality and further search the optimal operating point over this function for each CTU, all the PU modes need to be first classified into 11 mode classes according to their sizes. These classes are then properly ordered and sequentially checked according to the class index, from small to large so that the optimal mode can be early identified by checking when the RD cost starts to arise. In addition, an effective instant SKIP mode termination scheme is developed by simply checking the SKIP mode against a pre-determined threshold to further reduce the computational complexity. The extensive simulation results have shown that the proposed UMIMD algorithm is able to individually achieve a significant reduction on computational complexity at the encoder by 61.9% and 64.2% on average while incurring only 1.7% and 2.1% increment on the total Bjontegaard delta bit rate (BDBR) for the low delay and random access test conditions, compared with the exhaustive mode decision in the HEVC. Moreover, the experimental results have further demonstrated that the proposed UMIMD algorithm outperforms multiple state-of-the-art methods.
    @article{zeng2019unimodal,
    	title={Unimodal Model-Based Inter Mode Decision for High Efficiency Video Coding},
    	author={Zeng, Huanqiang and Xiang, Wenjie and Chen, Jing and Cai, Canhui and Ni, Zhangkai and Ma, Kai-Kuang},
    	journal={IEEE Access},
    	volume={7},
    	pages={27936--27947},
    	year={2019},
    	publisher={IEEE}
    }
  5. A Gabor Feature-Based Quality Assessment Model for the Screen Content Images
    Zhangkai Ni, Huanqiang Zeng, Lin Ma, Junhui Hou, Jing Chen, and Kai-Kuang Ma.
    IEEE Transactions on Image Processing (T-IP), vol. 27, no. 9, pp. 4516-4528, September 2018.
    Abstract | Paper | Code | Project | BibTex Abstract: In this paper, an accurate and efficient full-reference image quality assessment (IQA) model using the extracted Gabor features, called Gabor feature-based model (GFM), is proposed for conducting objective evaluation of screen content images (SCIs). It is well-known that the Gabor filters are highly consistent with the response of the human visual system (HVS), and the HVS is highly sensitive to the edge information. Based on these facts, the imaginary part of the Gabor filter that has odd symmetry and yields edge detection is exploited to the luminance of the reference and distorted SCI for extracting their Gabor features, respectively. The local similarities of the extracted Gabor features and two chrominance components, recorded in the LMN color space, are then measured independently. Finally, the Gabor-feature pooling strategy is employed to combine these measurements and generate the final evaluation score. Experimental simulation results obtained from two large SCI databases have shown that the proposed GFM model not only yields a higher consistency with the human perception on the assessment of SCIs but also requires a lower computational complexity, compared with that of classical and state-of-the-art IQA models.
    @article{ni2018gabor,
    	title={A Gabor feature-based quality assessment model for the screen content images},
    	author={Ni, Zhangkai and Zeng, Huanqiang and Ma, Lin and Hou, Junhui and Chen, Jing and Ma, Kai-Kuang},
    	journal={IEEE Transactions on Image Processing},
    	volume={27},
    	number={9},
    	pages={4516--4528},
    	year={2018},
    	publisher={IEEE}
    }
  6. Screen Content Image Quality Assessment Using Multi-Scale Difference of Gaussian
    Ying Fu, Huanqiang Zeng, Lin Ma, Zhangkai Ni, Jianqing Zhu, and Kai-Kuang Ma.
    IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), vol. 28, no. 9, pp. 2428-2432, September 2018.
    Abstract | Paper | Code | BibTex Abstract: In this paper, a novel image quality assessment (IQA) model for the screen content images (SCIs) is proposed by using multi-scale difference of Gaussian (MDOG). Motivated by the observation that the human visual system (HVS) is sensitive to the edges while the image details can be better explored in different scales, the proposed model exploits MDOG to effectively characterize the edge information of the reference and distorted SCIs at two different scales, respectively. Then, the degree of edge similarity is measured in terms of the smaller-scale edge map. Finally, the edge strength computed based on the larger-scale edge map is used as the weighting factor to generate the final SCI quality score. Experimental results have shown that the proposed IQA model for the SCIs produces high consistency with human perception of the SCI quality and outperforms the state-of-the-art quality models.
    @article{fu2018screen,
    	title={Screen content image quality assessment using multi-scale difference of gaussian},
    	author={Fu, Ying and Zeng, Huanqiang and Ma, Lin and Ni, Zhangkai and Zhu, Jianqing and Ma, Kai-Kuang},
    	journal={IEEE Transactions on Circuits and Systems for Video Technology},
    	volume={28},
    	number={9},
    	pages={2428--2432},
    	year={2018},
    	publisher={IEEE}
    }
  7. ESIM: Edge Similarity for Screen Content Image Quality Assessment
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Jing Chen, Canhui Cai, and Kai-Kuang Ma.
    IEEE Transactions on Image Processing (T-IP), vol. 26, no. 10, pp. 4818-4831, October 2017.
    Abstract | Paper | Code | Dataset | Project | BibTex Abstract: In this paper, an accurate full-reference image quality assessment (IQA) model developed for assessing screen content images (SCIs), called the edge similarity (ESIM), is proposed. It is inspired by the fact that the human visual system (HVS) is highly sensitive to edges that are often encountered in SCIs; therefore, essential edge features are extracted and exploited for conducting IQA for the SCIs. The key novelty of the proposed ESIM lies in the extraction and use of three salient edge features, i.e., edge contrast, edge width, and edge direction. The first two attributes are simultaneously generated from the input SCI based on a parametric edge model, while the last one is derived directly from the input SCI. The extraction of these three features will be performed for the reference SCI and the distorted SCI, individually. The degree of similarity measured for each above-mentioned edge attribute is then computed independently, followed by combining them together using our proposed edge-width pooling strategy to generate the final ESIM score. To conduct the performance evaluation of our proposed ESIM model, a new and the largest SCI database (denoted as SCID) is established in our work and made available to the public for download. Our database contains 1800 distorted SCIs that are generated from 40 reference SCIs. For each SCI, nine distortion types are investigated, and five degradation levels are produced for each distortion type. Extensive simulation results have clearly shown that the proposed ESIM model is more consistent with the perception of the HVS on the evaluation of distorted SCIs than the multiple state-of-the-art IQA methods.
    @article{ni2017esim,
    	title={ESIM: Edge similarity for screen content image quality assessment},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Chen, Jing and Cai, Canhui and Ma, Kai-Kuang},
    	journal={IEEE Transactions on Image Processing},
    	volume={26},
    	number={10},
    	pages={4818--4831},
    	year={2017},
    	publisher={IEEE}
    }
  8. Gradient Direction for Screen Content Image Quality Assessment
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Canhui Cai, and Kai-Kuang Ma.
    IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1394–1398, August 2016.
    Abstract | Paper | Code | Project | BibTex Abstract: In this letter, we make the first attempt to explore the usage of the gradient direction to conduct the perceptual quality assessment of the screen content images (SCIs). Specifically, the proposed approach first extracts the gradient direction based on the local information of the image gradient magnitude, which not only preserves gradient direction consistency in local regions, but also demonstrates sensitivities to the distortions introduced to the SCI. A deviation-based pooling strategy is subsequently utilized to generate the corresponding image quality index. Moreover, we investigate and demonstrate the complementary behaviors of the gradient direction and magnitude for SCI quality assessment. By jointly considering them together, our proposed SCI quality metric outperforms the state-of-the-art quality metrics in terms of correlation with human visual system perception.
    @article{ni2016gradient,
    	title={Gradient direction for screen content image quality assessment},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Cai, Canhui and Ma, Kai-Kuang},
    	journal={IEEE Signal Processing Letters},
    	volume={23},
    	number={10},
    	pages={1394--1398},
    	year={2016},
    	publisher={IEEE}
    }

Conference Publications

  1. Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network
    Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong.
    In Proceedings of the 28th ACM International Conference on Multimedia (ACM Multimedia), pp. 1697-1705, October 2020.
    Abstract | Paper | Code | BibTex Abstract: In this work, we aim to learn an unpaired image enhancement model, which can enrich low-quality images with the characteristics of high-quality images provided by users. We propose a quality attention generative adversarial network (QAGAN) trained on unpaired data based on the bidirectional Generative Adversarial Network (GAN) embedded with a quality attention module (QAM). The key novelty of the proposed QAGAN lies in the injected QAM for the generator such that it learns domain-relevant quality attention directly from the two domains. More specifically, the proposed QAM allows the generator to effectively select semantic-related characteristics from the spatial-wise and adaptively incorporate style-related attributes from the channel-wise, respectively. Therefore, in our proposed QAGAN, not only discriminators but also the generator can directly access both domains which significantly facilitate the generator to learn the mapping function. Extensive experimental results show that, compared with the state-of-the-art methods based on unpaired learning, our proposed method achieves better performance in both objective and subjective evaluations.
    @inproceedings{ni2020unpaired,
    	title={Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network},
    	author={Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Ma, Lin and Kwong, Sam},
    	booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
    	pages={1697--1705},
    	year={2020}
    }
  2. A JND Dataset Based on VVC Compressed Images
    Xuelin Shen, Zhangkai Ni, Wenhan Yang, Xinfeng Zhang, Shiqi Wang, and Sam Kwong.
    2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), June 2020.
    Abstract | Paper | Dataset | BibTex Abstract: In this paper, we establish a just noticeable distortion (JND) dataset based on the next generation video coding standard Versatile Video Coding (VVC). The dataset consists of 202 images which cover a wide range of content with resolution 1920×1080. Each image is encoded by VTM 5.0 intra coding with the quantization parameter (QP) ranging from 13 to 51. The details regarding dataset construction, subjective testing and data post-processing are described in this paper. Finally, the significance of the dataset towards future video coding research is envisioned. All source images as well as the testing data have been made available to the public.
    @inproceedings{shen2020jnd,
    	title={A JND Dataset Based on VVC Compressed Images},
    	author={Shen, Xuelin and Ni, Zhangkai and Yang, Wenhan and Zhang, Xinfeng and Wang, Shiqi and Kwong, Sam},
    	booktitle={2020 IEEE International Conference on Multimedia \& Expo Workshops (ICMEW)},
    	pages={1--6},
    	year={2020},
    	organization={IEEE}
    }
  3. SCID: A Database for Screen Content Images Quality Assessment
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Ying Fu, Lu Xing, and Kai-Kuang Ma.
    International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 774-779, November 2017.
    Abstract | Paper | Dataset | Project | BibTex Abstract: Perceptual quality assessment of screen content images (SCIs) has become a new challenging topic in the recent research of image quality assessment (IQA). In this work, we construct a new SCI database (called SCID) for subjective quality evaluation of SCIs and investigate whether existing IQA models can effectively assess the perceptual quality of distorted SCIs. The proposed SCID, which is currently the largest one, contains 1,800 distorted SCIs generated from 40 reference SCIs with 9 types of distortions and 5 degradation levels for each distortion type. The double-stimulus impairment scale (DSIS) method is then employed to rate the perceptual quality, in which each image is evaluated by at least 40 assessors. After processing, each distorted SCI is accompanied by one mean opinion score (MOS) value to indicate its perceptual quality as ground truth. Based on the constructed SCID, we evaluate the performances of 14 state-of-the-art IQA metrics. Experimental results show that the existing IQA metrics are not able to evaluate the perceptual quality of SCIs well and an IQA metric specifically for SCIs is thus desirable. The proposed SCID will be made publicly available to the research community for further investigation on the perceptual processing of SCIs.
    @inproceedings{ni2017scid,
    	title={SCID: A database for screen content images quality assessment},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Fu, Ying and Xing, Lu and Ma, Kai-Kuang},
    	booktitle={2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)},
    	pages={774--779},
    	year={2017},
    	organization={IEEE}
    }
  4. Screen Content Image Quality Assessment Using Euclidean Distance
    Ying Fu, Huanqiang Zeng, Zhangkai Ni, Jing Chen, Canhui Cai, and Kai-Kuang Ma.
    International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 44-49, November 2017.
    Abstract | Paper | BibTex Abstract: Considering that the human visual system (HVS) is greatly sensitive to edges, in this study, we design a new full-reference objective quality assessment method for screen content images (SCIs). The key novelty lies in the extraction of the edge information by computing the Euclidean distance of luminance in the SCIs. Since the HVS is greatly suitable for extracting structural information, the structure information is incorporated into our proposed model. The extracted information is then used to compute the similarity maps of the reference SCI and its distorted SCI. Finally, we combine the obtained maps by using our designed pooling strategy. Experimental results have shown that the designed method achieves higher correlation with the subjective quality score than state-of-the-art quality assessment models.
    @inproceedings{fu2017screen,
    	title={Screen content image quality assessment using Euclidean distance},
    	author={Fu, Ying and Zeng, Huanqiang and Ni, Zhangkai and Chen, Jing and Cai, Canhui and Ma, Kai-Kuang},
    	booktitle={2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)},
    	pages={44--49},
    	year={2017},
    	organization={IEEE}
    }
  5. Screen Content Image Quality Assessment Using Edge Model
    Zhangkai Ni, Lin Ma, Huanqiang Zeng, Canhui Cai, and Kai-Kuang Ma.
    IEEE International Conference on Image Processing (ICIP), pp. 81–85, August 2016.
    Abstract | Paper | Code | BibTex Abstract: Since the human visual system (HVS) is highly sensitive to edges, a novel image quality assessment (IQA) metric for assessing screen content images (SCIs) is proposed in this paper. The key novelty lies in the use of an existing parametric edge model to extract two types of salient attributes - namely, edge contrast and edge width, for the distorted SCI under assessment and its original SCI, respectively. The extracted information is subject to conduct similarity measurements on each attribute, independently. The obtained similarity scores are then combined using our proposed edge-width pooling strategy to generate the final IQA score. Hopefully, this score is consistent with the judgment made by the HVS. Experimental results have shown that the proposed IQA metric produces higher consistency with that of the HVS on the evaluation of the image quality of the distorted SCI than that of other state-of-the-art IQA metrics.
    @inproceedings{ni2016screen,
    	title={Screen content image quality assessment using edge model},
    	author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Cai, Canhui and Ma, Kai-Kuang},
    	booktitle={2016 IEEE International Conference on Image Processing (ICIP)},
    	pages={81--85},
    	year={2016},
    	organization={IEEE}
    }

Patents

  1. A Rain Removal Image Post-processing Method Based on Progressive Collaborative Representation
    Huanqiang Zeng, Xiangwei Lin, Zhangkai Ni, Jiuwen Cao, Jianqing Zhu, and Kai-Kuang Ma
    Application No. 10201906356T, July 2019. (Chinese Patent)
  2. Colour Image Demosaicing Using Progressive Collaborative Representation
    Kai-Kuang Ma and Zhangkai Ni
    Application No. 10201906356T, July 2019. (Singapore Patent)
  3. A Multi-Exposure Fused Image Quality Assessment Method Based on Contrast and Saturation
    Huanqiang Zeng, Lu Xing, Zhangkai Ni, Jiuwen Cao, Canhui Cai, and Kai-Kuang Ma
    Application No. 2016111584053, December 2016. (Chinese Patent)
  4. A Screen Content Image Quality Assessment Method Based on Phase Congruency
    Huanqiang Zeng, Zhangkai Ni, Lin Ma, Jiuwen Cao, Canhui Cai, and Kai-Kuang Ma
    Application No. 2016108863395, October 2016. (Chinese Patent)