The fusion of actively acquired depth with passively estimated depth has proven to be an effective strategy for improving depth acquisition. This combination allows the two modalities to complement each other. To fuse the two sensor outputs into a more accurate depth map, the limitations of active sensing, such as low lateral resolution, must be taken into account when combining it with a passive depth map. In this paper, we present a novel approach for the accurate fusion of active Time-of-Flight (ToF) depth and passive stereo depth. We propose a multimodal sensor fusion strategy based on a weighted energy optimization problem, where the weights are generated by combining edge information from a texture map with the active and passive depth maps. An objective evaluation of our fusion algorithm shows improved accuracy of the generated depth map in comparison with the depth map of each single modality as well as with the results of other fusion methods. Additionally, a visual comparison of our result shows better recovery of edges in regions where passive stereo estimates erroneous depth values. Moreover, a left-right consistency check on the result illustrates the ability of our approach to fuse the sensors consistently.
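To make the weighted energy formulation concrete, one generic form it could take is sketched below; this is an illustrative assumption for orientation only, not the paper's exact objective, and the symbols $d_{\mathrm{ToF}}$, $d_{\mathrm{st}}$, the weights $w$, and the smoothness term are hypothetical:

$$
E(d) \;=\; \sum_{p} \Big[\, w_{\mathrm{ToF}}(p)\,\big(d(p) - d_{\mathrm{ToF}}(p)\big)^{2} \;+\; w_{\mathrm{st}}(p)\,\big(d(p) - d_{\mathrm{st}}(p)\big)^{2} \,\Big] \;+\; \lambda \sum_{(p,q)\in\mathcal{N}} w_{e}(p,q)\,\big(d(p) - d(q)\big)^{2},
$$

where $d$ is the fused depth map, $d_{\mathrm{ToF}}$ and $d_{\mathrm{st}}$ are the active and passive depth inputs, $\mathcal{N}$ is the set of neighboring pixel pairs, and $w_{e}$ is an edge-aware weight derived from texture and depth edges that relaxes the smoothness penalty across depth discontinuities.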