About

Hi! I’m Zhangkai NI (倪张凯), a Ph.D. candidate in the Department of Computer Science, City University of Hong Kong, jointly supervised by Prof. Sam Tak Wu KWONG (Chair Professor at HKCityU-CS, IEEE Fellow) and Dr. Shiqi WANG (Assistant Professor at HKCityU-CS). Over the years, I have been lucky to have amazing collaborators who have helped me along the way. I am fortunate to work closely with Prof. Kai-Kuang MA (Professor at NTU, IEEE Fellow), Prof. Huanqiang ZENG (Professor at HQU), Dr. Lin MA (Principal Researcher at Meituan), and Dr. Wenhan YANG (Postdoctoral Fellow at CityU).

My research interests include computer vision, machine learning, and image processing. My current research is focused on generative modeling, unsupervised learning, image/video processing, and image/video quality assessment.

News

  • 2020.09: One paper on learning-based just noticeable distortion (JND) profile inference was accepted to IEEE T-IP.
  • 2020.09: One paper on unsupervised image enhancement was accepted to IEEE T-IP.
  • 2020.07: One paper on unpaired image enhancement was accepted to ACM MM 2020.
  • 2020.06: We built a Just Noticeable Distortion (JND) dataset based on the Versatile Video Coding (VVC) standard. Dataset
  • 2020.02: One paper on color image demosaicking was accepted to IEEE T-IP.
  • 2019.04: Received the Outstanding Master's Thesis Award from the Chinese Institute of Electronics (CIE).
  • 2018.07: One paper on SCI quality assessment based on the multi-scale difference of Gaussian was accepted to IEEE T-CSVT.
  • 2018.06: One paper on SCI quality assessment based on Gabor features was accepted to IEEE T-IP.
  • 2017.06: One paper on SCI quality assessment based on the edge model was accepted to IEEE T-IP.



Selected Publications

Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network
Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong.
IEEE Transactions on Image Processing (T-IP), vol. 29, pp. 9140-9151, September 2020.
Abstract | Paper | Code | BibTex
Abstract: Improving the aesthetic quality of images is challenging and eagerly sought by the public. To address this problem, most existing algorithms are based on supervised learning methods to learn an automatic photo enhancer from paired data, which consist of low-quality photos and their corresponding expert-retouched versions. However, the style and characteristics of photos retouched by experts may not meet the needs or preferences of general users. In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning from a large number of paired images. The proposed model is based on a single deep GAN that embeds modulation and attention mechanisms to capture richer global and local features. Based on the proposed model, we introduce two losses to deal with unsupervised image enhancement: (1) a fidelity loss, defined as an ℓ2 regularization in the feature domain of a pre-trained VGG network, which ensures that the content of the enhanced image matches that of the input image, and (2) a quality loss, formulated as a relativistic hinge adversarial loss, which endows the input image with the desired characteristics. Both quantitative and qualitative results show that the proposed model effectively improves the aesthetic quality of images. Our code is available at: https://github.com/eezkni/UEGAN.
@article{ni2020towards,
	title={Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network},
	author={Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Ma, Lin and Kwong, Sam},
	journal={IEEE Transactions on Image Processing},
	volume={29},
	pages={9140--9151},
	year={2020},
	publisher={IEEE}
}
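For readers who want something concrete, below is a minimal PyTorch sketch of the two losses described in the abstract: the VGG-feature fidelity loss and the relativistic hinge quality loss. The VGG layer cut-off and the omitted input normalization are illustrative simplifications rather than the paper's exact configuration; see https://github.com/eezkni/UEGAN for the actual implementation.

import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# Frozen VGG feature extractor for the fidelity loss; the layer cut-off is an
# assumption, and input normalization is omitted for brevity.
vgg_feat = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:16].eval()
for p in vgg_feat.parameters():
    p.requires_grad_(False)

def fidelity_loss(enhanced, source):
    # l2 regularization in the feature domain of a pre-trained VGG network.
    return F.mse_loss(vgg_feat(enhanced), vgg_feat(source))

def quality_loss_g(d_real, d_fake):
    # Relativistic hinge adversarial loss, generator side.
    return (F.relu(1.0 + (d_real - d_fake.mean())).mean()
            + F.relu(1.0 - (d_fake - d_real.mean())).mean()) / 2

def quality_loss_d(d_real, d_fake):
    # Relativistic hinge adversarial loss, discriminator side.
    return (F.relu(1.0 - (d_real - d_fake.mean())).mean()
            + F.relu(1.0 + (d_fake - d_real.mean())).mean()) / 2

x = torch.rand(1, 3, 64, 64)
print(fidelity_loss(x, x).item())  # 0.0 for identical inputs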
Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network
Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong.
In Proceedings of the 28th ACM International Conference on Multimedia (ACM Multimedia), pp. 1697-1705, October 2020.
Abstract | Paper | Code | BibTex
Abstract: In this work, we aim to learn an unpaired image enhancement model, which can enrich low-quality images with the characteristics of high-quality images provided by users. We propose a quality attention generative adversarial network (QAGAN) trained on unpaired data, based on a bidirectional Generative Adversarial Network (GAN) embedded with a quality attention module (QAM). The key novelty of the proposed QAGAN lies in the QAM injected into the generator, which learns domain-relevant quality attention directly from the two domains. More specifically, the proposed QAM allows the generator to effectively select semantically related characteristics along the spatial dimension and adaptively incorporate style-related attributes along the channel dimension. Therefore, in our proposed QAGAN, not only the discriminators but also the generator can directly access both domains, which significantly facilitates learning the mapping function. Extensive experimental results show that, compared with state-of-the-art methods based on unpaired learning, our proposed method achieves better performance in both objective and subjective evaluations.
@inproceedings{ni2020unpaired,
	title={Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network},
	author={Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Ma, Lin and Kwong, Sam},
	booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
	pages={1697--1705},
	year={2020}
}
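As a rough sketch of what a quality attention module can look like, the following PyTorch snippet combines channel-wise (style-related) and spatial-wise (semantic-related) attention computed from reference-domain features. The layer sizes and the form of the attention maps are illustrative assumptions and do not reproduce the paper's QAM.

import torch
import torch.nn as nn

class QualityAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel-wise attention: squeeze spatially, excite per channel
        # (style-related attributes).
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        # Spatial-wise attention: one map over locations (semantic-related
        # characteristics).
        self.spatial = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, content_feat, reference_feat):
        # Modulate content features with attention computed from the other domain.
        return content_feat * self.channel(reference_feat) * self.spatial(reference_feat)

x = torch.randn(1, 64, 32, 32)  # features from the low-quality domain
y = torch.randn(1, 64, 32, 32)  # features from the high-quality domain
print(QualityAttention(64)(x, y).shape)  # torch.Size([1, 64, 32, 32])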
Color Image Demosaicing Using Progressive Collaborative Representation
Zhangkai Ni, Kai-Kuang Ma, Huanqiang Zeng, and Baojiang Zhong.
IEEE Transactions on Image Processing (T-IP), vol. 29, pp. 4952-4964, March 2020.
Abstract | Paper | Code | BibTex
Abstract: In this paper, a progressive collaborative representation (PCR) framework is proposed that is able to incorporate any existing color image demosaicing method for further boosting its demosaicing performance. Our PCR consists of two phases: (i) offline training and (ii) online refinement. In phase (i), multiple training-and-refining stages are performed. In each stage, a new dictionary is established by learning a large number of feature-patch pairs extracted from the demosaicked images of the current stage and their corresponding original full-color images. After training, a projection matrix is generated and exploited to refine the current demosaicked image. The updated image, with improved image quality, is then used as the input for the next training-and-refining stage, which carries out the same processing. At the end of phase (i), all the projection matrices generated as mentioned above are exploited in phase (ii) to conduct online refinement of the demosaicked test image. Extensive simulations conducted on two test datasets commonly used for evaluating demosaicing algorithms (i.e., IMAX and Kodak) clearly demonstrate that our proposed PCR framework consistently boosts the performance of every image demosaicing method we experimented with, in terms of both objective and subjective performance evaluations.
@article{ni2020color,
	title={Color Image Demosaicing Using Progressive Collaborative Representation},
	author={Ni, Zhangkai and Ma, Kai-Kuang and Zeng, Huanqiang and Zhong, Baojiang},
	journal={IEEE Transactions on Image Processing},
	volume={29},
	pages={4952--4964},
	year={2020},
	publisher={IEEE}
}
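To make one training-and-refining stage concrete, the following numpy sketch learns a projection matrix with ridge-regularized least squares and applies it to refine patches. The patch size, the ridge parameter, and this closed-form solution are illustrative assumptions rather than the paper's exact learning procedure.

import numpy as np

def learn_projection(X, Y, lam=1e-3):
    # X: (d, n) feature patches from the demosaicked image, Y: (d, n) patches
    # from the original full-color image, stored as columns.
    # Ridge-regularized least squares: P = Y X^T (X X^T + lam I)^-1.
    d = X.shape[0]
    return Y @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(d))

def refine(patches, P):
    # Online refinement: apply the learned projection to test-image patches.
    return P @ patches

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 1000))          # e.g. 8x8 patches, vectorized
Y = X + 0.1 * rng.standard_normal(X.shape)   # corresponding ground-truth patches
P = learn_projection(X, Y)
print(refine(X[:, :5], P).shape)             # (64, 5)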
Just Noticeable Distortion Profile Inference: A Patch-level Structural Visibility Learning Approach
Xuelin Shen, Zhangkai Ni, Wenhan Yang, Shiqi Wang, Xinfeng Zhang, and Sam Kwong.
IEEE Transactions on Image Processing (T-IP), vol. 30, pp. 26-38, November 2020.
Abstract | Paper | Code | Dataset | Project | BibTex
Abstract: In this paper, we propose an effective approach to infer the just noticeable distortion (JND) profile based on patch-level structural visibility learning. Instead of pixel-level JND profile estimation, the image patch, which is regarded as the basic processing unit to better correlate with the human perception, can be further decomposed into three conceptually independent components for visibility estimation. In particular, to incorporate the structural degradation into the patch-level JND model, a deep learning-based structural degradation estimation model is trained to approximate the masking of structural visibility. In order to facilitate the learning process, a JND dataset is further established, including 202 pristine images and 7878 distorted images generated by advanced compression algorithms based on the upcoming Versatile Video Coding (VVC) standard. Extensive experimental results further show the superiority of the proposed approach over the state-of-the-art. Our dataset is available at: https://shenxuelin-cityu.github.io/jnd.html.
@article{shen2020just,
	title={Just Noticeable Distortion Profile Inference: A Patch-level Structural Visibility Learning Approach},
	author={Shen, Xuelin and Ni, Zhangkai and Yang, Wenhan and Wang, Shiqi and Zhang, Xinfeng and Kwong, Sam},
	journal={IEEE Transactions on Image Processing},
	volume={30},
	pages={26--38},
	year={2020},
	publisher={IEEE}
}
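The following PyTorch sketch illustrates the patch-level inference loop with a toy stand-in network: split the image into patches and predict a visibility score per patch. The network, patch size, and score range are placeholders, not the trained model from the paper.

import torch
import torch.nn as nn

PATCH = 32  # patch size is an assumption

# Toy stand-in for the learned structural degradation / visibility estimator.
visibility_net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1), nn.Sigmoid())

def jnd_profile(img):
    # img: (1, 1, H, W) luminance with H, W divisible by PATCH; returns one
    # visibility score per patch.
    patches = img.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)
    patches = patches.reshape(-1, 1, PATCH, PATCH)
    scores = visibility_net(patches)
    return scores.reshape(img.shape[2] // PATCH, img.shape[3] // PATCH)

print(jnd_profile(torch.rand(1, 1, 128, 128)).shape)  # torch.Size([4, 4])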
A Gabor Feature-Based Quality Assessment Model for the Screen Content Images
Zhangkai Ni, Huanqiang Zeng, Lin Ma, Junhui Hou, Jing Chen, and Kai-Kuang Ma.
IEEE Transactions on Image Processing (T-IP), vol. 27, no. 9, pp. 4516-4528, September 2018.
Abstract | Paper | Code | Project | BibTex
Abstract: In this paper, an accurate and efficient full-reference image quality assessment (IQA) model using extracted Gabor features, called the Gabor feature-based model (GFM), is proposed for conducting objective evaluation of screen content images (SCIs). It is well known that Gabor filters are highly consistent with the response of the human visual system (HVS), and that the HVS is highly sensitive to edge information. Based on these facts, the imaginary part of the Gabor filter, which has odd symmetry and acts as an edge detector, is applied to the luminance of the reference and distorted SCIs to extract their Gabor features. The local similarities of the extracted Gabor features and of two chrominance components, recorded in the LMN color space, are then measured independently. Finally, a Gabor-feature pooling strategy is employed to combine these measurements and generate the final evaluation score. Experimental results obtained on two large SCI databases show that the proposed GFM model not only yields higher consistency with human perception in the assessment of SCIs but also requires lower computational complexity, compared with classical and state-of-the-art IQA models.
@article{ni2018gabor,
    title={A Gabor feature-based quality assessment model for the screen content images},
    author={Ni, Zhangkai and Zeng, Huanqiang and Ma, Lin and Hou, Junhui and Chen, Jing and Ma, Kai-Kuang},
    journal={IEEE Transactions on Image Processing},
    volume={27},
    number={9},
    pages={4516--4528},
    year={2018},
    publisher={IEEE}
}
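The following numpy/scipy sketch illustrates the core idea: filter the luminance of the reference and distorted images with the imaginary (odd-symmetric) part of a Gabor filter and measure the local similarity of the responses. The filter parameters, the single orientation, and the similarity constant are illustrative assumptions, not the paper's settings.

import numpy as np
from scipy.ndimage import convolve

def gabor_imag(size=11, sigma=2.0, theta=0.0, freq=0.25):
    # Imaginary part of a Gabor filter: odd symmetry, acts as an edge detector.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.sin(2 * np.pi * freq * xr)

def gabor_similarity(ref_lum, dis_lum, C=1e-4):
    # Per-pixel similarity of the Gabor responses of reference and distorted images.
    g = gabor_imag()
    a, b = convolve(ref_lum, g), convolve(dis_lum, g)
    return (2 * a * b + C) / (a**2 + b**2 + C)

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
dis = ref + 0.05 * rng.standard_normal(ref.shape)
print(gabor_similarity(ref, dis).mean())  # simple mean pooling for illustration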
Screen Content Image Quality Assessment Using Multi-Scale Difference of Gaussian
Ying Fu, Huanqiang Zeng, Lin Ma, Zhangkai Ni, Jianqing Zhu, and Kai-Kuang Ma.
IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), vol. 28, no. 9, pp. 2428-2432, September 2018.
Abstract | Paper | Code | BibTex
Abstract: In this paper, a novel image quality assessment (IQA) model for screen content images (SCIs) is proposed by using the multi-scale difference of Gaussian (MDOG). Motivated by the observation that the human visual system (HVS) is sensitive to edges and that image details are better explored at different scales, the proposed model exploits MDOG to effectively characterize the edge information of the reference and distorted SCIs at two different scales. The degree of edge similarity is then measured in terms of the smaller-scale edge map. Finally, the edge strength computed from the larger-scale edge map is used as the weighting factor to generate the final SCI quality score. Experimental results show that the proposed IQA model produces high consistency with human perception of SCI quality and outperforms state-of-the-art quality models.
@article{fu2018screen,
    title={Screen content image quality assessment using multi-scale difference of {Gaussian}},
    author={Fu, Ying and Zeng, Huanqiang and Ma, Lin and Ni, Zhangkai and Zhu, Jianqing and Ma, Kai-Kuang},
    journal={IEEE Transactions on Circuits and Systems for Video Technology},
    volume={28},
    number={9},
    pages={2428--2432},
    year={2018},
    publisher={IEEE}
}
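The following numpy/scipy sketch mirrors the pipeline in the abstract: DoG edge maps at two scales, similarity measured on the smaller scale, and edge strength at the larger scale used as the pooling weight. The sigma pairs and the constant C are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def dog(img, s1, s2):
    # Difference-of-Gaussian edge map at one scale pair.
    return gaussian_filter(img, s1) - gaussian_filter(img, s2)

def mdog_score(ref, dis, C=1e-4):
    e_r, e_d = dog(ref, 0.5, 1.0), dog(dis, 0.5, 1.0)  # smaller-scale edge maps
    w = np.abs(dog(ref, 2.0, 4.0))                      # larger-scale edge strength
    sim = (2 * e_r * e_d + C) / (e_r**2 + e_d**2 + C)   # edge similarity
    return (sim * w).sum() / (w.sum() + 1e-12)          # edge-strength weighting

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
print(mdog_score(ref, ref + 0.05 * rng.standard_normal(ref.shape)))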
ESIM: Edge Similarity for Screen Content Image Quality Assessment
Zhangkai Ni, Lin Ma, Huanqiang Zeng, Jing Chen, Canhui Cai, and Kai-Kuang Ma.
IEEE Transactions on Image Processing (T-IP), vol. 26, no. 10, pp. 4818-4831, October 2017.
Abstract | Paper | Code | Dataset | Project | BibTex
Abstract: In this paper, an accurate full-reference image quality assessment (IQA) model developed for assessing screen content images (SCIs), called the edge similarity (ESIM) model, is proposed. It is inspired by the fact that the human visual system (HVS) is highly sensitive to edges, which are frequently encountered in SCIs; therefore, essential edge features are extracted and exploited for conducting IQA of SCIs. The key novelty of the proposed ESIM lies in the extraction and use of three salient edge features, i.e., edge contrast, edge width, and edge direction. The first two attributes are simultaneously generated from the input SCI based on a parametric edge model, while the last one is derived directly from the input SCI. These three features are extracted from the reference SCI and the distorted SCI individually. The degree of similarity for each edge attribute is then computed independently, and the three measurements are combined using our proposed edge-width pooling strategy to generate the final ESIM score. To evaluate the performance of our proposed ESIM model, a new and, to date, the largest SCI database (denoted SCID) was established in our work and made publicly available for download. Our database contains 1800 distorted SCIs generated from 40 reference SCIs. For each SCI, nine distortion types are investigated, and five degradation levels are produced for each distortion type. Extensive simulation results clearly show that the proposed ESIM model is more consistent with HVS perception in the evaluation of distorted SCIs than multiple state-of-the-art IQA methods.
@article{ni2017esim,
    title={ESIM: Edge similarity for screen content image quality assessment},
    author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Chen, Jing and Cai, Canhui and Ma, Kai-Kuang},
    journal={IEEE Transactions on Image Processing},
    volume={26},
    number={10},
    pages={4818--4831},
    year={2017},
    publisher={IEEE}
}
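The following numpy sketch illustrates the combination-and-pooling step, taking per-pixel maps of the three edge features as given: per-feature similarity maps are multiplied and pooled with an edge-width weight. The parametric edge model that extracts the features is omitted, and the constants and the weighting choice are illustrative assumptions.

import numpy as np

def feature_similarity(fr, fd, C=1e-4):
    # Standard similarity map between two feature maps.
    return (2 * fr * fd + C) / (fr**2 + fd**2 + C)

def esim_like_score(ref_feats, dis_feats):
    # ref_feats/dis_feats: dicts of per-pixel maps for the three edge features.
    sim = np.ones_like(ref_feats['width'])
    for k in ('contrast', 'width', 'direction'):
        sim *= feature_similarity(ref_feats[k], dis_feats[k])
    # Edge-width pooling: weight each pixel by the larger of the two edge widths.
    w = np.maximum(ref_feats['width'], dis_feats['width'])
    return (sim * w).sum() / (w.sum() + 1e-12)

rng = np.random.default_rng(0)
ref = {k: rng.random((64, 64)) for k in ('contrast', 'width', 'direction')}
dis = {k: v + 0.05 * rng.standard_normal(v.shape) for k, v in ref.items()}
print(esim_like_score(ref, dis))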
Gradient Direction for Screen Content Image Quality Assessment
Zhangkai Ni, Lin Ma, Huanqiang Zeng, Canhui Cai, and Kai-Kuang Ma.
IEEE Signal Processing Letters (SPL), vol. 23, no. 10, pp. 1394–1398, August 2016.
Abstract | Paper | Code | Project | BibTex
Abstract: In this letter, we make the first attempt to explore the usage of the gradient direction to conduct the perceptual quality assessment of screen content images (SCIs). Specifically, the proposed approach first extracts the gradient direction based on the local information of the image gradient magnitude, which not only preserves gradient direction consistency in local regions, but also demonstrates sensitivity to the distortions introduced to the SCI. A deviation-based pooling strategy is subsequently utilized to generate the corresponding image quality index. Moreover, we investigate and demonstrate the complementary behaviors of the gradient direction and magnitude for SCI quality assessment. By jointly considering them together, our proposed SCI quality metric outperforms the state-of-the-art quality metrics in terms of correlation with human visual system perception.
@article{ni2016gradient,
    title={Gradient direction for screen content image quality assessment},
    author={Ni, Zhangkai and Ma, Lin and Zeng, Huanqiang and Cai, Canhui and Ma, Kai-Kuang},
    journal={IEEE Signal Processing Letters},
    volume={23},
    number={10},
    pages={1394--1398},
    year={2016},
    publisher={IEEE}
}
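The following numpy/scipy sketch computes a gradient-direction similarity index with deviation-based pooling, in the spirit of the abstract above; the Sobel-based direction estimate and the standard-deviation pooling are illustrative assumptions rather than the paper's exact formulation.

import numpy as np
from scipy.ndimage import sobel

def gradient_direction(img):
    # Per-pixel gradient direction from vertical/horizontal Sobel responses.
    return np.arctan2(sobel(img, axis=0), sobel(img, axis=1))

def gd_index(ref, dis):
    sim = np.cos(gradient_direction(ref) - gradient_direction(dis))  # 1 = same direction
    return sim.std()  # deviation pooling: larger deviation = stronger distortion

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
print(gd_index(ref, ref + 0.05 * rng.standard_normal(ref.shape)))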


Academic Services

  • Journal Reviewer
    • IEEE Transactions on Image Processing (T-IP)
    • IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT)
    • Information Sciences (INS)
    • Journal of Visual Communication and Image Representation (JVCI)
    • Signal Processing: Image Communication (SPIC)
    • ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)
    • IET Image Processing
  • Conference Reviewer
    • ACM International Conference on Multimedia (ACM Multimedia): 2019, 2020
    • International Conference on Acoustics, Speech and Signal Processing (ICASSP): 2018, 2019, 2020
    • IEEE International Conference on Image Processing (ICIP): 2016, 2018, 2019

Have a wonderful day!
