There are many ways to save individual web pages and entire web sites for offline viewing. The tools below cover Linux, Windows, and Mac OS X, though not every tool runs on every platform. If you are looking for a way to take screenshots of web pages instead, try this page.
Saving Web Pages for Offline Viewing with Firefox
Firefox has an extension called Scrapbook that saves web pages for offline viewing. Scrapbook also lets you edit the saved pages, so you can add notes, highlighting, inline annotations, and more. It is an excellent tool for research.
Saving Web Sites for Offline Viewing with Firefox and Spiderzilla
Spiderzilla is a Firefox extension that downloads entire web sites using an embedded version of HTTrack. It looks like you can still download Spiderzilla, but the extension may no longer be maintained. Still worth checking out.
Saving Web Sites with HTTrack
HTTrack is a classic tool for downloading entire web sites, or parts of web sites. Think carefully before you run it against someone else's site: mirroring a large site uses up a lot of their bandwidth. If you only need individual pages, use the Scrapbook Firefox extension, described above, instead.
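HTTrack also has a command-line version. Here is a minimal sketch of a command-line mirror, where the URL, output directory, and filter are placeholders you would replace with a site you have permission to copy:
# placeholders: swap in your own URL, output directory, and filter
httrack "http://www.example.com/" -O ./example-mirror "+*.example.com/*" -v
See man httrack for the full list of options and URL filters.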
Saving Web Pages with Lynx in the Terminal
Tip: To install Lynx on Ubuntu/Debian, type sudo apt-get install lynx. If you want to install Lynx on Windows, I recommend using Cygwin. I'm not sure if Lynx comes with Mac OS X, but if it isn't on your Mac you can get the Mac version here.
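If you aren't sure whether Lynx is already installed, you can check from the terminal:
which lynx
If that prints a path, Lynx is already on your system.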
Lynx is a text-based web browser. I previously wrote a Lynx tutorial that shows how to extract text from web pages. You can also use Lynx to capture just the text of multiple web pages at once. It's a bit messy, though, and I don't recommend it unless you specifically need plain-text copies of many pages. Here's how it works:
First make a test directory:
mkdir lynx_testing
Navigate into that directory:
cd ./lynx_testing
Start the crawl. Don't do this on other people's large web sites, because it can use up a lot of their bandwidth.
lynx -crawl -traversal "http://www.[yoursite].com"
You will then end up with a directory full of text files with a .dat file extension.
Tip: You can change the .dat file extensions to .txt with the following command (make sure you are in the right directory first):
rename -v 's/\.dat$/\.txt/' *.dat
Or remove the file extensions altogether with the following command:
rename -v 's/\.dat$//' *.dat
More about the rename command here
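If your version of the rename command doesn't accept Perl-style expressions, a plain shell loop (a small sketch, run from the same directory) does the same job:
for f in *.dat; do mv "$f" "${f%.dat}.txt"; done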
Assuming that you are leaving the .dat file extensions for now, this is a list of files and what they contain:
traverse.dat — This file contains a list of URLs that were spidered.
traverse2.dat — This file contains a list of the URLs along with their HTML page titles, listed in the order they were encountered.
lnk00000###.dat — Each spidered web page is saved in a numbered file with the HTML title and URL at the top. Lynx is a text browser, so these files contain only the text content of the pages; the HTML is stripped out. I've had trouble opening these files from Nautilus, but you can easily open them in the terminal with commands like gedit lnk00000001.dat or vim lnk00000001.dat.
Tip: There is more information on the files created with -traversal here
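To get a quick look at the page files without opening them one at a time, you can print the first few lines of each one:
head -n 5 lnk*.dat
head labels each file with its name, so you can skim the titles and URLs at a glance.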
If you want to combine all the pages of text into one file for searching with a visual text editor like gedit, SciTE, or Notepad, you can use the cat command like this:
cat * > MyFile.txt
That will create a file called MyFile.txt that contains all of the text from the files in the current directory.
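If you'd rather skip the traverse*.dat bookkeeping files and combine only the page text, narrow the wildcard (the output filename here is just an example):
cat lnk*.dat > pages.txt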
You can also search all of the files at once with grep. Navigate to the directory with the files that you want to search and type something like:
grep -i "your search terms" *
The -i will make it a case-insensitive search. For more information on grep, type man grep in the terminal.
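If you also want to see where each match occurs, add -n to print line numbers along with the matching file names, for example:
grep -in "your search terms" *.dat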
GNU Wget
Wget will be covered in an upcoming post.
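In the meantime, here is a rough sketch of a recursive download with Wget, where the URL is a placeholder for a site you have permission to mirror:
wget --mirror --convert-links --page-requisites --no-parent "http://www.example.com/"
See man wget for what each of those options does.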
Summary
For saving individual web pages, I recommend the Scrapbook Firefox extension. For downloading and saving entire web sites, I recommend HTTrack (but avoid using it on other people's large sites). Wget is great for selectively grabbing files from a web page or site. If you know of other good tools for saving web pages for offline viewing, leave a comment below.