Researchers have developed a convolutional neural network (CNN) framework for the automated classification and segmentation of esophageal lesions in endoscopic images. Lesions were localized with a region-based convolutional neural network (R-CNN). A dual-stream esophageal lesion network (ELNet) classified the images into four diagnostic categories, and a set of three U-Net architectures segmented the lesions. The dual-stream ELNet achieved a classification accuracy of 92.14%, a sensitivity of 88.74%, and a specificity of 97.1%. The U-Net-based segmentation module reached an overall accuracy of 95.54% and a lesion segmentation sensitivity of 82.89%. The dual-stream ELNet outperformed the single-stream baseline networks, and the integrated architecture adapted better to different lesion types. The proposed framework performs classification and segmentation simultaneously and accurately, which suggests strong potential for clinical use.
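For readers less familiar with the reported metrics, the following minimal sketch shows one common way to derive accuracy, sensitivity, and specificity from a multi-class confusion matrix using a one-vs-rest reading of each class. The function name `one_vs_rest_metrics`, the row/column convention, and the toy 4x4 matrix are assumptions made for illustration; they are not taken from the study, and the abstract does not state how its per-class values were aggregated.

```python
from typing import List, Tuple


def one_vs_rest_metrics(
    confusion: List[List[int]],
) -> Tuple[float, List[float], List[float]]:
    """Return (accuracy, per-class sensitivity, per-class specificity).

    Assumed convention: confusion[i][j] counts samples whose true class
    is i and whose predicted class is j. Every class is assumed to have
    at least one positive and one negative sample, so no denominator is zero.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    accuracy = correct / total

    sensitivity: List[float] = []
    specificity: List[float] = []
    for k in range(n):
        tp = confusion[k][k]                              # class k predicted as k
        fn = sum(confusion[k]) - tp                       # class k predicted as other
        fp = sum(confusion[i][k] for i in range(n)) - tp  # other predicted as k
        tn = total - tp - fn - fp                         # everything else
        sensitivity.append(tp / (tp + fn))
        specificity.append(tn / (tn + fp))
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts for four diagnostic categories (illustrative only).
    toy = [
        [8, 1, 1, 0],
        [0, 9, 1, 0],
        [1, 0, 8, 1],
        [0, 0, 0, 10],
    ]
    acc, sens, spec = one_vs_rest_metrics(toy)
    print("accuracy:", acc)
    print("sensitivity per class:", sens)
    print("specificity per class:", spec)
```

The same tallies apply to pixel-wise segmentation if each pixel is treated as a two-class sample (lesion versus background), which is one plausible reading of the segmentation accuracy and sensitivity quoted above.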