How to get URL content in Java

In this Java example, we show you how to get the content of a page from the URL “mkyong.com” and save it to a local file named “test.html”.


package com.mkyong;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

public class GetURLContent {
	public static void main(String[] args) {

		// save to this filename
		String fileName = "/users/mkyong/test.html";

		try {
			// get URL content
			URL url = new URL("https://mkyong.com");
			URLConnection conn = url.openConnection();

			// open the stream and put it into BufferedReader;
			// FileWriter creates the file if it does not exist yet.
			// try-with-resources closes both, even if an exception is thrown
			try (BufferedReader br = new BufferedReader(
					new InputStreamReader(conn.getInputStream()));
					BufferedWriter bw = new BufferedWriter(new FileWriter(fileName))) {

				String inputLine;

				while ((inputLine = br.readLine()) != null) {
					bw.write(inputLine);
					// readLine() strips the line terminator, so write it back
					bw.newLine();
				}
			}

			System.out.println("Done");

		} catch (MalformedURLException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}

	}
}
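
As a variation on the example above, the response can also be copied to disk byte for byte with java.nio.file.Files.copy. This skips decoding the page into text, so the original line endings and character encoding are kept as sent by the server. The following is a minimal sketch, not a definitive version: the class name SaveUrlToFile and the helper save are made up for illustration, and the output path test.html in the working directory is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical class name, used only for this sketch
final class SaveUrlToFile {

	// Copies every byte of the stream into target, replacing any existing file,
	// and returns the number of bytes written
	static long save(InputStream in, Path target) throws IOException {
		return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
	}

	public static void main(String[] args) throws IOException {
		URL url = URI.create("https://mkyong.com").toURL();
		// assumed output location: test.html in the current working directory
		Path target = Paths.get("test.html");

		// try-with-resources closes the network stream when the copy finishes
		try (InputStream in = url.openStream()) {
			long bytes = save(in, target);
			System.out.println("Done, " + bytes + " bytes written");
		}
	}
}
```

Because save takes any InputStream, it can be exercised with an in-memory stream instead of a real network connection.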

About Author

Founder of Mkyong.com, love Java and open source stuff. Follow him on Twitter. If you like my tutorials, consider making a donation to these charities.

Comments

37 Comments
Himaja bonda
4 years ago

Yes, I know how to extract links from a URL, but in practice I don’t know how to implement code to extract data from the obtained links. Please help me with this, anyone.

Adam
7 years ago

I have written a program that takes data from a certain API. But I want to continuously fetch data from the API and save it to a file such that I append the data (with some modifications) I get in the second instance to the first one, and so on, and that too for over thousand plus requests. A quick reply would be appreciated.

Shafraz
8 years ago

Hello, I want to retrieve the title from an MVC framework that uses the file tiles.xml to assign the title and set the body content.
I can only go through the URL to retrieve the title.
Any ideas, please?
Thanks in advance.

pushkar kamra
8 years ago

I need to get all the web pages of the website, not just the current web page.

umesh sharma
9 years ago

What is the JUnit test case for the above program?

Rudradev Pathak
10 years ago

I want to read only the text content from a web page, not the JavaScript, CSS, or HTML tags. How should we write the code?
I used many patterns to replace all of those things with a space, but it’s not working.
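
For the text-only question above, a rough sketch with regular expressions might look like the following. It removes script and style blocks (including their content), then comments, then any remaining tags, and finally collapses whitespace. This is only an approximation: it does not decode entities such as &amp;amp; and it can be fooled by malformed markup, so a real HTML parser (for example the third-party jsoup library) is the more reliable choice. The class name HtmlToText and the method stripTags are hypothetical names chosen for this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this sketch
final class HtmlToText {

	// script/style elements including their content
	// (?i) = case-insensitive, (?s) = dot also matches line breaks
	private static final Pattern SCRIPT_OR_STYLE =
			Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
	// HTML comments
	private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
	// any remaining tag
	private static final Pattern TAG = Pattern.compile("<[^>]+>");

	static String stripTags(String html) {
		String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
		text = COMMENT.matcher(text).replaceAll(" ");
		text = TAG.matcher(text).replaceAll(" ");
		// collapse runs of whitespace left behind by the removed markup
		return text.replaceAll("\\s+", " ").trim();
	}

	public static void main(String[] args) {
		String html = "<html><head><style>p { color: red; }</style></head>"
				+ "<body><p>Hello <b>world</b></p><script>alert('x');</script></body></html>";
		System.out.println(stripTags(html)); // prints: Hello world
	}
}
```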

Sama Ansari
10 years ago

I need help with peer-to-peer networking on Android. Please help me.

Mittal
10 years ago

Hi MKYONG

How do I test this program with a JUnit test? Please tell me the steps.

nick
9 years ago
Reply to  Mittal

Hey, did you get the JUnit test case for this program? If so, please mail it to me at [email protected] ASAP. It’s urgent.
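
One way to make the program testable, sketched here under assumptions, is to move the copy loop into a method that takes a Reader and a Writer. A test can then pass in-memory strings instead of a live URL and a real file. The class name LineCopier and the method copyLines are hypothetical; the checks in main are plain Java so the sketch needs no extra library, and in a JUnit test they would become assertEquals calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical class name, used only for this sketch
final class LineCopier {

	// Copies in to out line by line, ending each line with '\n';
	// returns the number of lines copied
	static int copyLines(Reader in, Writer out) throws IOException {
		BufferedReader br = new BufferedReader(in);
		int lines = 0;
		String line;
		while ((line = br.readLine()) != null) {
			out.write(line);
			out.write('\n');
			lines++;
		}
		out.flush();
		return lines;
	}

	public static void main(String[] args) throws IOException {
		// in-memory stand-ins for the network stream and the output file
		StringWriter out = new StringWriter();
		int lines = copyLines(new StringReader("<html>\n<body>hi</body>\n</html>"), out);

		// in a JUnit test these checks would be assertEquals calls
		if (lines != 3 || !out.toString().equals("<html>\n<body>hi</body>\n</html>\n")) {
			throw new AssertionError("unexpected copy result");
		}
		System.out.println("copyLines OK");
	}
}
```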

Shrikant
10 years ago

Using this code I am able to get the source code of the site, but the source code I get is not complete; some part of the site is missing.
Please give me a solution so that I get the full source code from the site.

Dinesh
10 years ago
Reply to  Shrikant

I am also getting only part of the HTML source. I am using BufferedReader to read the InputStream.

Anand
10 years ago

Hi,
I’m trying to save the HTML content from web services to an SD card on Android. I have a list of URLs, each pointing to a corresponding HTML page. How do I download them from the web services to the SD card?

Harsha
10 years ago

I am getting

java.io.IOException: Access is denied exception.

Is there a way to bypass authentication

Harsha
10 years ago
Reply to  Harsha

Sorry, that’s working. It looks like I had permission issues with creating the file.

Shmilfke
11 years ago

How do I get it to check the website at regular times, for example, every five minutes?

Thanks!

vivek ghavle
10 years ago
Reply to  Shmilfke

just put the code line

main(args);

at the end of the program.

Kurret
11 years ago

Thanks, that was really good! You do a great job!

peter
11 years ago

Hi,
Thanks for your work, mkyong, but this code doesn’t work for HTTPS sites.
Any idea how to go about it?

rasul
11 years ago

Thank you for this information.

??????? ?????????
11 years ago

nice post,
thanx

Jonathan
11 years ago

I want to write a program that gets my notifications from facebook. Do you think the code above would work?

Jonathan
11 years ago
Reply to  mkyong

Thank you, you are right. I ended up using RestFB.

Jawahar
11 years ago

Hi,

When I try the above program, I am getting error as below:

java.net.ConnectException: Connection timed out: connect

Please help

Cristian Rivera
11 years ago
Reply to  Jawahar

I’m sorry about my last reply. I misread your comment. A connection timeout mostly occurs if you have an inconsistent internet connection or if the site is having trouble. I just tried out the code a few minutes ago and I had no problems.
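
Explicit timeouts do not fix an unreachable network or a missing proxy setting, but they make the failure fast and predictable: by default URLConnection uses a timeout of 0, which means wait indefinitely (subject to operating system limits). A minimal sketch follows; the class name TimeoutConfig, the helper open, and the values of 5 and 10 seconds are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical class name, used only for this sketch
final class TimeoutConfig {

	// Creates a URLConnection with both timeouts set (in milliseconds);
	// no network traffic happens until the connection is actually used
	static URLConnection open(URL url, int connectMillis, int readMillis) throws IOException {
		URLConnection conn = url.openConnection();
		conn.setConnectTimeout(connectMillis);
		conn.setReadTimeout(readMillis);
		return conn;
	}

	public static void main(String[] args) throws IOException {
		URL url = URI.create("https://mkyong.com").toURL();
		// assumed values: 5 s to connect, 10 s to wait for data
		URLConnection conn = open(url, 5_000, 10_000);

		// the connection is established here, on getInputStream()
		try (InputStream in = conn.getInputStream()) {
			System.out.println("Connected, first byte: " + in.read());
		} catch (SocketTimeoutException e) {
			System.out.println("Timed out: " + e.getMessage());
		}
	}
}
```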

Cristian Rivera
11 years ago
Reply to  Jawahar

Hello Jawahar, I just tried out the above code and it seemed to work for me. The only part that wasn’t included with the HTML document was the site’s images. If you don’t mind me asking, did you change the output path from String fileName = “/users/mkyong/test.html”; to your own?

Guest
6 years ago

Thank you, it works like a charm 🙂

prakash thakur
8 years ago

I want to get data on who is hitting my website. How can I do that? Please help.

sunayana
10 years ago

Hello Sir,

I am getting below error when I execute the above program, please guide.

java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at GetURLContent.main(GetURLContent.java:25)

Ram
9 years ago
Reply to  sunayana

Need to check the internet connectivity

Manas Ranjan
10 years ago

Hi,
When I am trying to execute the above code in my local machine(Windows 7) it is working fine, but when I am trying to execute the same code it is giving
java.net.UnknownHostException: https://mkyong.com
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:175)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at java.net.Socket.connect(Socket.java:495)
at sun.net.NetworkClient.doConnect(NetworkClient.java:178)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:409)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:240)
at sun.net.www.http.HttpClient.New(HttpClient.java:321)
at sun.net.www.http.HttpClient.New(HttpClient.java:338)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:935)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:876)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:801)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
at GetURLContent.main(GetURLContent.java:22)

Can you suggest what the reason for this might be?
Thanks in advance.

Mittal
10 years ago
Reply to  Manas Ranjan

Please connect your system to the Internet and then try again.

Ajay
10 years ago

Thanks for this code. It would be a great help if you could tell me how to get the website content into a txt file. I was trying, but this code shows the HTML code along with the content. Please help me out.

Thanks

Ashim
11 years ago

I want the reverse case: read a file (HTML content) from the hard disk and display it as an HTML page on the web (I am using Java EE, Hibernate, JSF 2.0, and GlassFish 3+ as the server).
Your help will be highly appreciated.

Shmilfke
11 years ago

How do I get the program to do this regularly, for example every five minutes?

Rohan Sethi
8 years ago
Reply to  Shmilfke

Use a timer.
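
A timer, as suggested above, can be sketched with the standard ScheduledExecutorService. The class name PeriodicFetch and the helper every are hypothetical names for this sketch, and the task shown is only a placeholder where the download code from the article would go.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical class name, used only for this sketch
final class PeriodicFetch {

	// Runs task immediately and then again after every period;
	// returns the scheduler so the caller can stop it with shutdown()
	static ScheduledExecutorService every(long period, TimeUnit unit, Runnable task) {
		ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
		scheduler.scheduleAtFixedRate(() -> {
			try {
				task.run();
			} catch (RuntimeException e) {
				// an uncaught exception would suppress all later runs, so log and continue
				e.printStackTrace();
			}
		}, 0, period, unit);
		return scheduler;
	}

	public static void main(String[] args) {
		// placeholder task; the download code from the article would go here
		every(5, TimeUnit.MINUTES, () -> System.out.println("fetching page..."));
		// the scheduler thread is non-daemon, so the JVM keeps running until shutdown() is called
	}
}
```

Unlike calling main(args) again at the end of the program, this waits between runs and does not grow the call stack with each repetition.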